GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         March 26, 2004, 15:57:50 ; Search time 71.3326 Seconds
                                           (without alignments)
                                           4515.528 Million cell updates/sec

Title:          US-10-092-390-2
Perfect score:  6744
Sequence:       1 MVISLNSCLSFICLLLCHWI SSPKQEDSGGSSSNSSSSSE 1140

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1586107 seqs, 282547505 residues

Total number of hits satisfying chosen parameters:      1586107

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
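
The header names the search model (sw, i.e. Smith-Waterman local alignment), the scoring table and the gap penalties. As a rough illustration only, the sketch below computes a local alignment score with affine gaps. The function names, the toy `toy_score` callback (a stand-in for BLOSUM62, which is not reproduced here) and the convention that a gap of length k costs `gap_open + (k - 1) * gap_ext` are assumptions for illustration, not GenCore's documented behaviour. Each (i, j) pair filled in the inner loop is one cell update, so the reported rate is approximately query length times database residues divided by search time.

```python
def sw_affine_score(query, target, score, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score with affine gaps (Gotoh-style recurrence)."""
    neg_inf = float("-inf")
    cols = len(target) + 1
    h_prev = [0.0] * cols        # best score ending at row i-1, column j
    f_prev = [neg_inf] * cols    # best score ending in a vertical gap, row i-1
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf              # best score ending in a horizontal gap, current row
        for j in range(1, cols):
            e = max(h_cur[j - 1] - gap_open, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_ext)
            diag = h_prev[j - 1] + score(query[i - 1], target[j - 1])
            h_cur[j] = max(0.0, diag, e, f_cur[j])   # 0.0 restarts the local alignment
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def toy_score(a, b):
    # Toy substitution score (assumption); a real run would look up BLOSUM62.
    return 5.0 if a == b else -100.0


def cell_updates_per_sec(query_len, db_residues, seconds):
    """One cell update per (query residue, database residue) pair."""
    return query_len * db_residues / seconds


if __name__ == "__main__":
    print(sw_affine_score("AAAACCCC", "AAAAGCCCC", toy_score))
    print(cell_updates_per_sec(1140, 282547505, 71.3326) / 1e6)
```

With the figures from the header (1140 query residues, 282547505 database residues, 71.3326 seconds), the second function gives a rate within rounding of the reported 4515.528 million cell updates per second.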

Database :      A_Geneseq_29Jan04:*
                1: geneseqp1980s:*
                2: geneseqp1990s:*
                3: geneseqp2000s:*
                4: geneseqp2001s:*
                5: geneseqp2002s:*
                6: geneseqp2003as:*
                7: geneseqp2003bs:*
                8: geneseqp2004s:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

                         SUMMARIES

Result            Query
   No.    Score   Match  Length  DB  ID        Description
     1     6744   100.0    1140   5  AAE27985  Aae27985 Human EGF
     2     6744   100.0    1140   7  ADD18688  Add18688 Human dis
     3     6744   100.0    1192   7  ADE71305  Ade71305 Novel hum
     4     3601    53.4     586   5  AAE27986  Aae27986 Human EGF
     5   3107.5    46.1     878   4  ABG08033  Abg08033 Novel hum
     6   2506.5    37.2    1050   4  AAB66267  Aab66267 Human TAN
     7   2482.5    36.8     994   5  AAG79417  Aag79417 CADHP-6,
     8     1909    28.3     636   4  AAB66269  Aab66269 Rat TANGO
     9     1897    28.1    1350   6  ADA21141  Ada21141 Human sec
    10     1879    27.9    1577   6  ABJ37904  Abj37904 NOVX prot
    11   1874.5    27.8    1261   7  ADD78227  Add78227 Human CGD
    12     1860    27.6     349   6  ABP75770  Abp75770 Human sec
    13   1858.5    27.6    1450   6  ABJ37901  Abj37901 NOVX prot
    14   1847.5    27.4     739   6  ABU03489  Abu03489 Angiogene
    15     1770    26.2    1403   6  ABJ37903  Abj37903 NOVX prot
    16   1761.5    26.1    1398   6  ABJ37900  Abj37900 NOVX prot
    17   1761.5    26.1    1404   6  ABJ37899  Abj37899 NOVX prot
    18     1708    25.3    1097   6  ADA21140  Ada21140 Human sec
    19     1522    22.6     384   4  AAG75479  Aag75479 Human col
    20     1466    21.7     321   4  ABG27639  Abg27639 Novel hum
    21     1272    18.9     466   4  ABG22559  Abg22559 Novel hum
    22     1252    18.6     434   4  ABB66756  Abb66756 Drosophil
    23   1241.5    18.4     762   4  ABG08032  Abg08032 Novel hum
    24     1192    17.7     474   4  AAY72715  Aay72715 HFICU08 c
    25     1169    17.3     269   4  ABG08031  Abg08031 Novel hum
    26     1169    17.3     269   6  ABO00812  Abo00812 Polypepti
    27     1169    17.3     311   6  ABO00512  Abo00512 Novel hum
    28   1034.5    15.3    2444   5  ABB07821  Abb07821 Constitut
    29   1034.5    15.3    2556   2  AAO27066  Aao27066 Human Not
    30   1034.5    15.3    2556   6  ABG70518  Abg70518 Human pol
    31   1034.5    15.3    2556   6  AAG79773  Aag79773 Human Not
    32   1034.5    15.3    2556   6  ABP72571  Abp72571 Human Not
    33   1034.5    15.3    2556   6  ABR61830  Abr61830 Human Not
    34   1034.5    15.3    2556   7  ABR61759  Abr61759 Human Not
    35     1024    15.2    2531   7  ADE63713  Ade63713 Rat Prote
    36     1024    15.2    2531   7  ADE63705  Ade63705 Rat Prote
    37     1024    15.2    2531   7  ADE63709  Ade63709 Rat Prote
    38     1024    15.2    2531   7  ADE63701  Ade63701 Rat Prote
    39   1018.5    15.1     661   6  ABU11760  Abu11760 Human MDD
    40   1014.5    15.0    1473   5  AAE18208  Aae18208 Human MOL
    41   1014.5    15.0    1473   7  ADD18194  Add18194 Human mol
    42   1014.5    15.0    2471   2  AAO27065  Aao27065 Human Not
    43   1014.5    15.0    2471   6  AAG79774  Aag79774 Human Not
    44   1014.5    15.0    2471   6  ABP72572  Abp72572 Human Not
    45   1014.5    15.0    2471   6  ABR61831  Abr61831 Human Not
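
The Query Match values in the summaries are consistent with the hit score expressed as a percentage of the perfect score (6744) and rounded to one decimal place. That relationship is inferred from the listed numbers rather than taken from GenCore documentation, and the function name below is hypothetical. A minimal sketch of the arithmetic:

```python
PERFECT_SCORE = 6744  # "Perfect score" from the report header


def query_match_percent(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-alignment) score."""
    return round(100.0 * score / perfect, 1)


if __name__ == "__main__":
    # (result number, score) pairs copied from the summary listing
    for result_no, score in [(1, 6744), (4, 3601), (5, 3107.5), (45, 1014.5)]:
        print(result_no, score, query_match_percent(score))
```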



ALIGNMENTS 



RESULT 1 
AAE27985 

ID AAE27985 standard; protein; 1140 AA. 
XX 

AC AAE27985; 
XX 

DT 27-JAN-2003 (first entry) 
XX 

DE Human EGF-family protein #1. 
XX 

KW Human; EGF-family protein; novel human protein; NHP; drug discovery; 

KW restriction fragment length polymorphism analysis; forensic biology; 

KW toxicity; infectious disease; biological disorder; medical disorder; 

KW mental disorder; gene therapy. 
XX 

OS Homo sapiens. 



XX 

PN WO200272611-A2.
XX

PD 19-SEP-2002.
XX

PF 06-MAR-2002; 2002WO-US007477.
XX

PR 12-MAR-2001; 2001US-0275013P.
XX 

PA (LEXI-) LEXICON GENETICS INC. 
XX 

PI Yu X, Miranda M; 
XX 

DR WPI; 2002-723315/78. 

DR N-PSDB; AAD46318. 
XX 

PT New novel human nucleic acids useful for e.g. identifying protein coding 

PT sequences and mapping unique genes to a particular chromosome, as DNA

PT markers for restriction fragment length polymorphism analysis, or in

PT forensic biology. 
XX 

PS Claim 2; Page 37-40; 42pp; English. 
XX 

CC The present sequence is EGF-family protein, a novel human protein (NHP).

CC The NHP sequences are useful for mapping unique genes to a particular 

CC chromosome; as DNA markers for restriction fragment length polymorphism 

CC analysis; in forensic biology; in defining and monitoring both drug 

CC action and toxicity; in identifying, selecting and validating novel 

CC molecular targets for drug discovery; in microarrays or other assay 

CC formats to screen collections of genetic material from patients who have 

CC a particular medical condition. The NHP peptides, fusion proteins, 

CC antibodies, antagonists and agonists can be used for detecting mutant 

CC NHPs or inappropriately expressed NHPs for the diagnosis of disease; for 

CC screening drugs for treatment of symptomatic or phenotypic manifestations 

CC of perturbing the normal function of NHP in the body and to treat 

CC diseases including infectious, mental, biological, or medical diseases or 

CC disorders. They are also used in gene therapy.

XX 

SQ Sequence 1140 AA; 



Query Match 100.0%; Score 6744; DB 5; Length 1140; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1140; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy     1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy    61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy   121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy   181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy   241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy   301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy   361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy   421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy   481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy   541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600

Qy   601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660

Qy   661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720

Qy   721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780

Qy   781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840

Qy   841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900

Qy   901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960

Qy   961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020

Qy  1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080

Qy  1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140



RESULT 2 
ADD18688 

ID ADD18688 standard; protein; 1140 AA. 
XX 

AC ADD18688; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Human disease related protein SeqID119. 
XX 

KW human; disease state; cytostatic; antiinflammatory; ophthalmological;

KW antiarteriosclerotic; vulnerary; gene therapy;

KW hypoxia-regulated condition; tumourigenesis; angiogenesis; apoptosis;

KW inflammation; erythropoiesis; glycolysis; gluconeogenesis;

KW glucose transportation; catecholamine synthesis; iron transport;

KW nitric oxide synthesis; cancer; ischaemic condition; reperfusion injury;

KW retinopathy; neonatal stress; pre-eclampsia; atherosclerosis;

KW inflammatory condition; wound healing. 

XX 

OS Homo sapiens.
XX

PN WO2003018621-A2.
XX

PD 06-MAR-2003.
XX

PF 23-AUG-2002; 2002WO-GB003892.
XX

PR 23-AUG-2001; 2001GB-00020558.

PR 05-OCT-2001; 2001GB-00024037.
XX 

PA (OXFO-) OXFORD BIOMEDICA UK LTD. 

XX 

PI Kingsman SM, White J, Ward NR, Harris RA, Naylor S, Mundy CR; 
XX 

DR WPI; 2003-290046/28. 

DR N-PSDB; ADD18689. 
XX 

PT New substantially purified polypeptide, useful for diagnosing or treating 

PT a hypoxia-regulated condition, such as cancer, ischemia, reperfusion 

PT injury, retinopathy, pre-eclampsia, atherosclerosis, inflammation, or 

PT wound healing. 
XX 

PS Claim 25; SEQ ID NO 119; 424pp; English. 
XX 

CC This invention relates to novel human genes and gene product which are 

CC implicated in certain disease states. Compounds which modulate the

CC proteins of the invention may have cytostatic, antiinflammatory, 

CC ophthalmological, antiarteriosclerotic or vulnerary activities. The 

CC sequences of the invention may be useful for gene therapy. The invention 

CC may be useful for diagnosing or treating a hypoxia-regulated condition, 



CC such as tumourigenesis, angiogenesis, apoptosis, inflammation,

CC erythropoiesis, or the biological response to hypoxia conditions

CC including processes such as glycolysis, gluconeogenesis, glucose

CC transportation, catecholamine synthesis, iron transport or nitric oxide 

CC synthesis. The disease includes cancer, ischaemic conditions, reperfusion 

CC injury, retinopathy, neonatal stress, pre-eclampsia, atherosclerosis, 

CC inflammatory conditions or wound healing. The present sequence is that of 

CC a disease related protein of the invention. 

XX 

SQ Sequence 1140 AA; 

Query Match 100.0%; Score 6744; DB 7; Length 1140; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1140; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 





Qy     1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy    61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy   121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy   181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy   241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy   301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy   361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy   421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy   481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy   541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600

Qy   601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660

Qy   661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720

Qy   721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780

Qy   781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840

Qy   841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900

Qy   901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960

Qy   961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020

Qy  1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080

Qy  1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140



RESULT 3 
ADE71305 

ID ADE71305 standard; protein; 1192 AA. 
XX 

AC ADE71305; 
XX 

DT 29-JAN-2004 (first entry) 
XX 

DE Novel human protein #59. 
XX 

KW human; novel protein; drug. 
XX 

OS Homo sapiens. 
XX 

PN JP2002345493-A. 
XX 

PD 03-DEC-2002. 
XX 

PF 29-MAR-2001; 2002JP-00049046.
XX

PR 29-MAR-2001; 2001JP-00095524.
XX 



PA (KAZU-) ZH KAZUSA DNA KENKYUSHO. 
XX 

DR WPI; 2003-460885/44. 

DR N-PSDB; ADE71243. 
XX 

PT A gene and a protein encoded by it, used in drugs. 
XX 

PS Disclosure; Page 242-247; 257pp; Japanese. 
XX 

CC The invention comprises the amino acid and coding sequences of novel 

CC human proteins. The DNA and protein sequences of the invention are used 

CC in drugs. The present amino acid sequence represents a novel human 

CC protein of the invention. 
XX 

SQ Sequence 1192 AA; 

Query Match 100.0%; Score 6744; DB 7; Length 1192; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1140; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy     1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    53 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 112

Qy    61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   113 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 172

Qy   121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   173 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 232

Qy   181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   233 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 292

Qy   241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   293 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 352

Qy   301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   353 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 412

Qy   361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   413 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 472

Qy   421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   473 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 532

Qy   481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   533 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 592

Qy   541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   593 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 652

Qy   601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   653 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 712

Qy   661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   713 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 772

Qy   721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   773 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 832

Qy   781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   833 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 892

Qy   841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   893 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 952

Qy   901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   953 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 1012

Qy   961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1013 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1072

Qy  1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1073 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1132

Qy  1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1133 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1192
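
In this alignment the query spans positions 1 to 1140 while the database entry spans 53 to 1192, with no indels reported. For a gap-free alignment like this one, the coordinate mapping is a constant offset. A minimal sketch, with a hypothetical helper name and defaults taken from the first alignment block above:

```python
def db_position(qy_pos, qy_start=1, db_start=53):
    """Map a query coordinate to a database coordinate.

    Valid only for a gap-free alignment (Indels 0), where the two
    coordinate systems differ by a constant offset.
    """
    return db_start + (qy_pos - qy_start)


if __name__ == "__main__":
    for qy in (1, 601, 1140):
        print(qy, "->", db_position(qy))
```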



RESULT 4 
AAE27986 

ID AAE27986 standard; protein; 586 AA. 
XX 

AC AAE27986; 
XX 

DT 27-JAN-2003 (first entry) 
XX 

DE Human EGF-family protein #2. 
XX 

KW Human; EGF-family protein; novel human protein; NHP; drug discovery; 
KW restriction fragment length polymorphism analysis; forensic biology; 
KW toxicity; infectious disease; biological disorder; medical disorder; 
KW mental disorder; gene therapy. 
XX 

OS Homo sapiens.
XX

PN WO200272611-A2.
XX

PD 19-SEP-2002.
XX

PF 06-MAR-2002; 2002WO-US007477.
XX

PR 12-MAR-2001; 2001US-0275013P.
XX

PA (LEXI-) LEXICON GENETICS INC.
XX 

PI Yu X, Miranda M; 
XX 

DR WPI; 2002-723315/78. 

DR N-PSDB; AAD46319. 
XX 

PT New novel human nucleic acids useful for e.g. identifying protein coding 

PT sequences and mapping unique genes to a particular chromosome, as DNA 

PT markers for restriction fragment length polymorphism analysis, or in 

PT forensic biology. 
XX 

PS Claim 2; Page 40-42; 42pp; English. 
XX 

CC The present sequence is EGF-family protein, a novel human protein (NHP).

CC The NHP sequences are useful for mapping unique genes to a particular 

CC chromosome; as DNA markers for restriction fragment length polymorphism 

CC analysis; in forensic biology; in defining and monitoring both drug 

CC action and toxicity; in identifying, selecting and validating novel 

CC molecular targets for drug discovery; in microarrays or other assay 

CC formats to screen collections of genetic material from patients who have 

CC a particular medical condition. The NHP peptides, fusion proteins, 

CC antibodies, antagonists and agonists can be used for detecting mutant 

CC NHPs or inappropriately expressed NHPs for the diagnosis of disease; for 

CC screening drugs for treatment of symptomatic or phenotypic manifestations 

CC of perturbing the normal function of NHP in the body and to treat 

CC diseases including infectious, mental, biological, or medical diseases or 

CC disorders. They are also used in gene therapy.

XX 

SQ Sequence 586 AA; 



Query Match 53.4%; Score 3601; DB 5; Length 586; 

Best Local Similarity 100.0%; Pred. No. 4.8e-170;

Matches 586; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy     1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy    61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy   121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy   181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy   241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy   301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy   361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy   421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy   481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy   541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCY 586
         ||||||||||||||||||||||||||||||||||||||||||||||
Db   541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCY 586
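
Each database entry in this listing is laid out with two-letter line codes (ID, AC, DT, DE, KW, OS, PN and so on) and XX separator lines. As an illustration of how such an entry might be read programmatically, the sketch below collects the text for each code into a dictionary. The function name and the choice to join continuation lines with a single space are assumptions; this is not an official Geneseq parser.

```python
from collections import defaultdict


def parse_record(text):
    """Group the lines of one flat-file entry by their two-letter code."""
    fields = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "XX":      # skip blank lines and separators
            continue
        code, _, value = line.partition(" ")
        if len(code) == 2 and code.isupper():
            fields[code].append(value.strip())
    # Continuation lines (e.g. several KW lines) are joined with a space.
    return {code: " ".join(values) for code, values in fields.items()}


SAMPLE = """ID AAE27986 standard; protein; 586 AA.
XX
AC AAE27986;
XX
DE Human EGF-family protein #2.
XX
KW Human; EGF-family protein; novel human protein; NHP; drug discovery;
KW restriction fragment length polymorphism analysis; forensic biology;
XX
OS Homo sapiens.
"""

record = parse_record(SAMPLE)

if __name__ == "__main__":
    for code, value in record.items():
        print(code, "=>", value)
```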



RESULT 5
ABG08033

ID ABG08033 standard; protein; 878 AA.
XX

AC ABG08033;
XX

DT 13-FEB-2002 (first entry)
XX

DE Novel human diagnostic protein #8024.
XX

KW Human; chromosome mapping; gene mapping; gene therapy; forensic;

KW food supplement; medical imaging; diagnostic; genetic disorder.
XX

OS Homo sapiens.
XX

PN WO200175067-A2.
XX

PD 11-OCT-2001.
XX

PF 30-MAR-2001; 2001WO-US008631.
XX

PR 31-MAR-2000; 2000US-00540217.

PR 23-AUG-2000; 2000US-00649167.
XX

PA (HYSE-) HYSEQ INC.
XX

PI Drmanac RT, Liu C, Tang YT;
XX

DR WPI; 2001-639362/73.

DR N-PSDB; AAS72220.
XX





PT New isolated polynucleotide and encoded polypeptides, useful in

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 20; SEQ ID NO 38392; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders 

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human diagnostic

CC amino acid sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 878 AA; 

Query Match 46.1%; Score 3107.5; DB 4; Length 878; 

Best Local Similarity 59.5%; Pred. No. 1.6e-145; 

Matches 594; Conservative 15; Mismatches 69; Indels 321; Gaps 13 

Qy 338 ACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLEN--THSCHPMSGECACKPGWSGLY 395 

III: : I I I I i : I I I I I I I I I I I II I I I I I I 

Db 5 ALLCQLTYA C 1 SAQLICPFAMEQQLVACCHPMSGECACKPGWSGLY 50 

Qy 396 CNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSR 455
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  51 CNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSR 110

Qy 456 CGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCT 515
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 111 CGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCT 170

Qy 516 CAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEG 575
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 171 CAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEG 230

Qy 576 RWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSS 635
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 231 RWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSS 290



Qy 636 GPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWI 695



I I I I I I I 1 I I I I I I I I I i I ! II i II: : : : 

Db 291 GPCHHITGLCDCLPGFTGALCNEA YSQCVPVADLGKTVQEFV 332 

Qy 696 GSDCSQPCPP AHWGPNCIH TCNCHNGAFCSAYDGECKCTP — 735 

: : : I I I I : I I : I I I 

Db 333 PAPTTEPVTPLTDLVSVTPVGLAVTALNHVHLPTGAQTASTRATAIMELSAAPTM 387

Qy 736 GWTGL YCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTG 779 

III I i I I I I I I II I I I I I I I I i I I I I I I I I II I I 

Db 388 GNVNALLAGQGSTALRRSPRHSCRAAAS PWFYGKDCALICQCQNGADCDHI SGQCTCRTG 447 

Qy 780 FMGRHCEQK 788
       |||||||||
Db 448 FMGRHCEQKVRPPWDHRWLLTALGGGGVTTRMKTEFKFSILFFWALPSSPSYFWNV7VAQS 507

Qy 789 788 

Db 508 LKRSSRAFFMAEAEPGSHIGGQYIRWGGGLVAQGQSLLLPCAVWTVSATMIPGMLSSSGT 567 

Qy 789 CPSGTYGYGCRQIC 802 

I I I I I II I I I I II I 

Db 568 LLGVQVSLNRNPLKGLSSRCAGLAVRDSLAPNSQGWKATFDFPSLECPSGTYGYGCRQIC 627 

Qy 803 DCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGI 862
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 628 DCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGI 687

Qy 863 IILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGTLPHSNGGNANS 922
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 688 IILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGTLPHSNGGNANS 747

Qy 923 HYFTNPSYHTLTQCATSP-HVNNRDRMTVTKSKNNQLFVNLKNVNPGKRGPVGDCTGTLP 981 

I I I I I I I I I I I I I I I I I I I I I I I : I 
Db 748 HYFTNPSYHTLTQCATSPSRSTTGDRMTVHEFKK 781

Qy 982 ADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSNCSLSSSENPYATIKDPPVLIP 1041 

Db 782 781 

Qy 1042 KSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDPYDL 1101 

: I : I II I II I ! I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 782 QSTVC--ESMKSPARRDSPYAEINNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDPYDL 839

Qy 1102 PKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140 

        |||||||||||||||||||||||||||||||||||||||

Db 840 PKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 878 



RESULT 6 
AAB66267 

ID AAB66267 standard; protein; 1050 AA. 
XX 

AC AAB66267;
XX 

DT 05-APR-2001 (first entry) 
XX 

DE Human TANGO 272 SEQ ID NO: 14. 



XX 

KW Membrane associated protein; secreted protein; human; mouse; rat; 

KW INTERCEPT 340; MANGO 003; MANGO 347; TANGO 272; TANGO 295; TANGO 354; 

KW TANGO 378; skeletal disorder; cardiovascular disorder; renal disorder; 

KW haematopoietic disorder; neural disorder; hepatic disorder; 

KW neoplastic disease. 

XX 

OS Homo sapiens. 
XX 

PN WO200100673-A1.
XX 

PD 04-JAN-2001. 

XX 

PF 29-JUN-2000; 2000WO-US018198.
XX 

PR 30-JUN-1999; 99US-00345464.
XX 

PA (MILL-) MILLENNIUM PHARM INC. 
XX 

PI Barnes TM, Fraser CC, Wrighton N, Myers P, Busfield SJ, Sharp JD; 
XX 

DR WPI; 2001-050128/06. 

DR N-PSDB; AAF27787. 
XX 

PT Isolated secreted or transmembrane proteins are used for diagnosis and 

PT treatment of neoplastic and hematopoietic disorders e.g. T cell 

PT disorders, cancer and tumors.
XX 

PS Claim 9; Page 227-229; 294pp; English. 
XX 

CC The present invention provides the protein and coding sequences for a 

CC number of membrane associated and secreted proteins from human, mouse and 

CC rat. The proteins are designated INTERCEPT 340, MANGO 003, MANGO 347, 

CC TANGO 272, TANGO 295, TANGO 254 and TANGO 378. The proteins are all 

CC involved in signal transduction and the sequences can be used in the 

CC treatment of cardiovascular, renal, hepatic, neural, neoplastic, skeletal 

CC and haematopoietic disorders 

XX 

SQ Sequence 1050 AA; 



Query Match 37.2%; Score 2506.5; DB 4; Length 1050; 

Best Local Similarity 40.5%; Pred. No. 8.4e-116; 

Matches 490; Conservative 111; Mismatches 345; Indels 263; Gaps 30; 

Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW 66 

III : I II I I I I I I I I : : I : I I : II : I I 

Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66 

Qy 67 FKCTRHRVSYR TAY 80

I I : I : 

Db 67 SPQTQRKLLASRDSFCMVCVGAGVQWRDRSALQPQTGNALSMRPQPRVLSGAPSLASPGH 126

Qy 81 RHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGWGGTNCSSA-- 138 

I I : I : : I I I I I I I t I I I I I : I I I II I : I I I I II I I I I : I I I I 

Db 127 TVWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGWRGDDCSSAPN 186 



Qy 139 CDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQR 194



I : : I I I I I I I : I I I : I I I I I I I I I : I I I I 
Db 187 CLQPCTPGYYGPACQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGT 237 

Qy 195 CQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECS 254 

: I I I II I I I I I I I III 

Db 238 SGFFC PSTH PCQNGGVFQTPQGSCS 262 

Qy 255 CPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVG 314 

II I I I I I : I I I I I I I I I I I I I : I I I I I II I I I I I : I I I I I : II :: I I I I I 

Db 263 CPPGWMGTICSLPCPEGFHGPNCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVG 322 

Qy 315 TYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLEN 374 

: I I I I I I I : I : : I I I I I I II I : I I I I I I : I I I : I II I : 

Db 323 RFGQDCAETCDCAPDARCFPANGACLCEHGFTGDRCTDRLCPDGFYGLSCQAPCTCDREH 382 

Qy 375 THSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGF 434

: I I I i I : I I I : 1 I I I : I I : I I I : I : I I I : I I : I I : : I I I I I I : 

Db 383 SLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGVCQATSGLCQCAPGY 442 

Qy 435 KGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFG 494

I I : : I I I I I : I II : I I I : I I I I : I I I I I II : I I : II II I I I 
Db 443 TGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGTWGFS 502 

Qy 495 CNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTT 554 

I I : I I I : I : I I II I I I I I : I I I I : I I I I I II I : I I I I 
Db 503 CNASCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGCDPVH 562 

Qy 555 GHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRI 614 

I I : I III I I I I 1 I I I I I I I I I : I I :: I I I I I I I I I : I I I 
Db 563 GRCQCQAGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRGPSCQRS 622 

Qy 615 CSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICT 674 

I I I I I I I : I I 
Db 623 CQPGRYGKR CVP CK 636

Qy 675 CTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCT 734 

II: | : | : : I I I I I I I I I I I I I I I I I I I I I I : I I I I I I 
Db 637 CANHSFCHPSNGTCYCLAGWTGPDCSQPCPPGHWGENCAQTCQCHHGGTCHPQDGSCICP 696 

Qy 735 PGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTY 794

I I I I : I : I I I I : I : I : III I I II 
Db 697 LGWTGHHCLEGCPLGTFGANCSQPCQCGPGEKC HPE 732 

Qy 795 GYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPAD-- 852

I I I I I I I I I : | : | : | 

Db 733 TGACVCPPGHSGAPCR IG IQEPFTVMPTTPV 763 

Qy 853 SY-QIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGT 911

: I : I I : I I : I : I : I : I I I I I I I I I I I I III: I : : : I : 

Db 764 AYNSLGAVIGIAVLGSLWALVALFIGYRHWQKGKEHHHLAVAYSSG-RLDGSEYVMPDV 822 

Qy 912 LPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN-VNPGKR 970

I : I I I : : i I I I I i I : I I : : I I : I I : I : I II 

Db 823 PP SYSHYYSNPSYHTLSQCSPNPPPPNK VPGPLFASLQNPERPG-- 866 

Qy 971 GPVG-DCTGTLPADWKH GGYLNELGAFGLDRSYMGKSL KDLGKNSEY 1016 

III I I I I I I I I I I : I : I I I I I II I 



Db 867 GAQGHDNHTTLPADWKHRREPPPGPLDR-GSSRLDRSYSYSYSNGPGPFYDKGLISEEEL 925



Qy 1017 NSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYE 1076 

: I I I I I I I I I I I I : I I I I I : I I I I : I I I 
Db 926 GASVASL-SSENPYATIRDLPSLPGGPRESSYMEMKGPPSGSAPRQPPQFWDSQRRR 981

Qy 1077 VEPTVSWQGVFSNNGRL SQDP YDLPKNSHI PCHYDLLPVRDS 1119 

: I I : I II I I I I I I I I I I II I I II I 

Db 982 -QPQPQRDSGTYEQPSPLIHDRDSVGSQPPLPPGLPPGHYDSPKNSHIPGHYDLPPVRHP 1040 

Qy 1120 SSSP-KQED 1127 

I I : : : I 
Db 1041 PSPPLRRQD 1049 



RESULT 7 
AAG79417 

ID AAG79417 standard; protein; 994 AA. 
XX 

AC AAG79417;
XX 

DT 25-OCT-2002 (first entry) 
XX 

DE CADHP-6, Incyte ID No: 4097936CD1. 
XX 

KW Human; cell adhesion protein; CADHP; AIDS; Alzheimer's disease;

KW acquired immunodeficiency syndrome; thymic dysplasia; epilepsy; 

KW renal tubular acidosis; congenital glaucoma; cancer; atherosclerosis; 

KW Parkinson's disease. 

XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers
FT Domain 1..609
FT /label= Sushi_repeat
FT /note= "Identified by BLAST-DOMO"
FT Peptide 1..29
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..28
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..25
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..24
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..22
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..20
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..20
FT /label= Signal_cleavage
FT /note= "Identified by SPSCAN"



FT Peptide 1..19
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..18
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Peptide 1..16
FT /label= Signal_peptide
FT /note= "Identified by HMMER"
FT Modified-site 30
FT /note= "Potentially phosphorylated"
FT Modified-site 38
FT /note= "Potentially phosphorylated"
FT Domain 101..131
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Domain 120..131
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Domain 120..131
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Binding-site 127..129
FT /label= Cell_attachment_sequence
FT /note= "Identified by MOTIFS"
FT Peptide 133..161
FT /label= Type_III_EGF-like_signature
FT /note= "Identified by BLIMPS-PRINTS"
FT Domain 138..576
FT /label= Sushi_repeat
FT /note= "Identified by BLAST-DOMO"
FT Domain 144..174
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Modified-site 152
FT /note= "Potentially glycosylated"
FT Modified-site 153
FT /note= "Potentially glycosylated"
FT Modified-site 154
FT /note= "Potentially phosphorylated"
FT Domain 187..216
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Domain 205..216
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 229..259
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Domain 248..259
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 248..259
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Domain 248..252
FT /label= Sushi_domain_protein
FT /note= "Identified by BLIMPS-PFAM"

FT Modified-site 271
FT /note= "Potentially glycosylated"
FT Domain 272..302
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Peptide 284..302
FT /label= Type_III_EGF-like_signature
FT /note= "Identified by BLIMPS-PRINTS"
FT Domain 291..302
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 291..302
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Domain 315..345
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Domain 334..345
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 334..345
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Modified-site 346
FT /note= "Potentially phosphorylated"
FT Modified-site 355
FT /note= "Potentially phosphorylated"
FT Domain 365..391
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Domain 380..391
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 380..391
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Modified-site 392
FT /note= "Potentially glycosylated"
FT Domain 404..434
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Domain 423..434
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 423..434
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Modified-site 446
FT /note= "Potentially glycosylated"
FT Domain 447..477
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Modified-site 448
FT /note= "Potentially phosphorylated"
FT Modified-site 460
FT /note= "Potentially phosphorylated"



FT Domain 466..477
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Modified-site 476
FT /note= "Potentially glycosylated"
FT Domain 490..520
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Modified-site 491
FT /note= "Potentially glycosylated"
FT Domain 509..520
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 509..520
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Domain 533..563
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Modified-site 535
FT /note= "Potentially phosphorylated"
FT Domain 552..563
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 552..563
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Modified-site 566
FT /note= "Potentially phosphorylated"
FT Modified-site 575
FT /note= "Potentially glycosylated"
FT Domain 576..606
FT /label= EGF-like_domain
FT /note= "Identified by HMMER-PFAM"
FT Modified-site 581
FT /note= "Potentially phosphorylated"
FT Domain 595..606
FT /label= EGF-like_domain_signature_1
FT /note= "Identified by MOTIFS"
FT Domain 595..606
FT /label= EGF-like_domain_signature_2
FT /note= "Identified by MOTIFS"
FT Domain 603..614
FT /label= Sushi_domain_protein
FT /note= "Identified by BLIMPS-PFAM"
FT Domain 619..648



Query Match 36.8%; Score 2482.5; DB 5; Length 994; 

Best Local Similarity 41.6%; Pred. No. 1.2e-114; 

Matches 482; Conservative 110; Mismatches 347; Indels 221; Gaps 27;



Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYS VTVQESYPHPFDQI YYTSCTDILNW FKCT 70

III : I II I I I I i I I I : : I : I I : II : I I I

Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130

: I III II II : I : : III I I I I I III II : M I I i I : I II III I I I

Db 67 QPTWYRTVYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGW 126

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190 

I : I I I I I I I I I I I : I : I : I I I : I : I I I I I 

Db 127 RGDDCSSECAPGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPA 186 

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250

I 1111:1111 I I I I I II I : I I I I I 1 I I I I I 

Db 187 CQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGTSGFFCPSTHPCQNGGVFQTPQ 245 

Qy 251 GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 310 

I I I I I I I I I I : I I I I I I I I I I I I I : I I I I I I I I I I I I : I I I I I : I I :: I 
Db 246 GSCSCPPGWMGTICSLPCPEGFHGPNCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREE 305 

Qy 311 CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 370 

I I I I : I I I I I I I : I : : I I I I I I I I I : I I II I I : I I I : I II 

Db 306 CPVGRFGQDCAETCDCAPDARCFPANGACLCEHGFTGDRCTDRLCPDGFYGLSCQAPCTC 365

Qy 371 HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 430 

I : : I I I I I : I i I : I I I I : I I : I I I : I : I I I : I I : I I : : I I I 

Db 366 DREHSLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGVCQATSGLCQC 425

Qy 431 APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 490

I I I : I I : : II I I I : I I I : I I I : I I I I : II I I I I I : I I : II I I 

Db 426 APGYTGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGT 485 

Qy 491 WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 550

II I I I : I I I : I : I I I I I I I I hill I : I I I I I I I I : I II 

Db 486 WGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGC 545 

Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 610 

I I I : I Ml I I I I I I I I I I I II I : I I :: I I II II I I I : 
Db 546 DPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRGPS 605 

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670 

I I I I I I I I I I : I 

Db 606 CQRSCQPGRYGKR CVP 621 

Qy 671 GICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGE 730

I I I : I I : I 

Db 622 CKCANHSFCHPSNGT 636 

Qy 731 CKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCP 790 

II I I I I I : I I I I I I : I : I : III I I II 

Db 637 CYCLAGWTGPDCSQRCPLGTFGANCSQPCQCGPGEKC HPE 676

Qy 791 SGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALP 850 

I I I I I I I I I : | : I : I 

Db 677 TGACVCPPGHSGAPCR IG IQEPFTVMP 703 

Qy 851 AD--SY-QIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYT 907

: I : II : I I : I : I : I : I I I I I I I I I I II III: I : : : I 

Db 704 TTPVAYNSLGAVIGIAVLGSLWALVALFIGYRHWQKGKEHHHLAVAYSSG-RLDGSEYV 762 

Qy 908 ISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNL-KNVN 966 

: I : I I I : : I I I I I I I : I I : : I I : I I : I I 

Db 763 MPDVPP SYSHYYSNPSYHTLSQCSPNPPPPNK VPGPLFASLQKPER 808 



Qy 967 PGKRGPVG-DCTGTLPADWKH GGYLNELGAFGLDRSYMGKSL KDLGK 1012 

I I ! I I MINIM 11: I : I II I I II 

Db 809 PG--GAQGHDNHTTLPADWKHRREPPPGPLDR-GSSRLDRSYSYSYSNGPGPFYNKGLIS 865

Qy 1013 NSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSP ARR 1057 

I : I I I M I M I I M : I I I I I : M I I M 

Db 866 EEELGASVASL-SSENPYATIRDLPSLPGGPRESSYMEMKGPPSGSPPRQPPQFWDSQRR 924 

Qy 1058 DSPYAEINNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDP YDLPKNSHIP 1108 

I : : : I II : I : : : III I I I M I I I I 

Db 925 RQPQPQRDSGT YE-QPSPL IHDRDSVGSQPPLPPGLPPGHYDSPKNSHIP 973 

Qy 1109 CHYDLLPVRDSSSSP-KQED 1127 

I M I II I I I : : : I 

Db 974 GHYDLPPVRHPPSPPLRRQD 993



RESULT 8
AAB66269
ID AAB66269 standard; protein; 636 AA.
XX
AC AAB66269;
XX
DT 05-APR-2001 (first entry)
XX
DE Rat TANGO 272 SEQ ID NO: 20.
XX
KW Membrane associated protein; secreted protein; human; mouse; rat;
KW INTERCEPT 340; MANGO 003; MANGO 347; TANGO 272; TANGO 295; TANGO 354;
KW TANGO 378; skeletal disorder; cardiovascular disorder; renal disorder;
KW haematopoietic disorder; neural disorder; hepatic disorder;
KW neoplastic disease.
XX
OS Rattus sp.
XX
PN WO200100673-A1.
XX
PD 04-JAN-2001.
XX
PF 29-JUN-2000; 2000WO-US018198.
XX
PR 30-JUN-1999; 99US-00345464.
XX
PA (MILL-) MILLENNIUM PHARM INC.
XX
PI Barnes TM, Fraser CC, Wrighton N, Myers P, Busfield SJ, Sharp JD;
XX
DR WPI; 2001-050128/06.
DR N-PSDB; AAF27791.
XX
PT Isolated secreted or transmembrane proteins are used for diagnosis and
PT treatment of neoplastic and hematopoietic disorders e.g. T cell
PT disorders, cancer and tumors.
XX
PS Claim 9; Page 238-240; 294pp; English.
XX





CC The present invention provides the protein and coding sequences for a 

CC number of membrane associated and secreted proteins from human, mouse and 

CC rat. The proteins are designated INTERCEPT 340, MANGO 003, MANGO 347, 

CC TANGO 272, TANGO 295, TANGO 254 and TANGO 378. The proteins are all 

CC involved in signal transduction and the sequences can be used in the 

CC treatment of cardiovascular, renal, hepatic, neural, neoplastic, skeletal 

CC and haematopoietic disorders 

XX 

SQ Sequence 636 AA; 

Query Match 28.3%; Score 1909; DB 4; Length 636; 

Best Local Similarity 45.1%; Pred. No. 1.6e-86; 

Matches 328; Conservative 77; Mismatches 212; Indels 110; Gaps 9 

Qy 260 MGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVL 319 

I I : I I I I I I I I I : I I I : I I I I I I I II I I I I : M I I : I I : : I I I I I : I 
Db 1 MGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCREECPVGRFGQD 60 

Qy 320 CAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCH 379

I I I I I I I : I : : I I I I I I I I I : I I I I I I : I I I : I II I : : I I I 
Db 61 CAETCDCAPGARCFPANGACLCEHGFTGDRCTERLCPDGRYGLSCQDPCTCDPEHSLSCH 120

Qy 380 PMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDC 439 

I I I I I : I : I I I : I I : I I I : I : I I I : I I : I I : : i I I I I I : I I 

Db 121 PMHGECSCQPGWAGLHCNESCPQDTHGAGCQEHCLCLHGGVCLADSGLCRCAPGYTGPHC 180 

Qy 440 STPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTC 499 

: II I I I I I I I I I I : I 111111:11111 : I I : I I I I I I I I I : I 
Db 181 ANLCPPNTYGINCSSHCSCENAIACSPVDGTCICKEGWQRGNCSVPCPPGTWGFSCNASC 240 

Qy 500 QCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRC 559 

I I : I I : I I I I I I I I I I : I I I I : I II I Ml I I I I I 

Db 241 QCAHEGVCSPQTGACTCTPGWRGVHCQLPCPKGQFGEGCASVCDCDHSDGCDPVHGHCRC 300

Qy 560 LPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGF 619

III I I I I I I I I I I I I I I : I I :: I I I I I I I I I : I I I I I I 

Db 301 QAGWMGTRCHLPCPEGFWGANCSNACTCKNGGTCVPENGNCVCAPGFRGPSCQRPCPPGR 360 

Qy 620 YGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNG 679

III 1:1 III: 
Db 361 YGKR CVP CKCNNHS 374 

Qy 680 TCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTG 739 

: I : I I : I I I I I I I I : I I I I I I I I I I : I I I I I I I I I I I I 

Db 375 SCHPSDGTCSCLAGWTGPDCSESCPPGHWGLKCSQPCQCHHGATCHPQDGSCVCIPGWTG 434 

Qy 740 LYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCR 799 

I : : I I : I : I : : I I I I I II 
Db 435 PNCSEGCPSRMFGVNCSQLCQCDPGEMC HPE 465 

Qy 800 QICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSYQIGAI 859 

I I I I I I II I I I : I : I : I : M : 

Db 466 TGACVCPPGHSGAHCK VGSQESFTIMPTS-PV1HNSLGAV 504

Qy 860 AGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGTLPHSNGGN 919

I I : I : I : I : I I I I I I I I I I I I III: I : : I I : I 
Db 505 IGIAVLGTLWALVALFIGYRHWQKGKEHEHLAVAYSTG-RLDGSDYVMPDVSP 557 



Qy 920 ANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGKRGPVGDCTGT 979 

: I I I : : I I I I i I I : I I : : I I I : I I I I : : I I I 

Db 558 SYSHYYSNPSYHTLSQCSPNPPPPN KIPGSQLFVSSQASERPNRNHGRDNHAT 610 

Qy 980 LPADWKH 986
       |||||||
Db 611 LPADWKH 617



RESULT 9 
ADA21141 

ID ADA21141 standard; protein; 1350 AA. 
XX 

AC ADA21141; 
XX 

DT 20-NOV-2003 (first entry) 
XX 

DE Human secreted protein SECP-46 SEQ ID NO: 46. 
XX 

KW human; secreted protein; SECP; anti-HIV; antiallergic; antiinflammatory; 

KW antianaemic; antiparkinsonian; nootropic; anticonvulsant; 

KW antiarteriosclerotic; antiasthmatic; immunosuppressive; antithyroid; 

KW cytostatic; hepatotropic; dermatological; antidiabetic; nephrotropic;

KW antigout; thyromimetic; neuroprotective; osteopathic; antiarthritic;

KW antiparasitic; antihelminthic; antipsoriatic; uropathic;

KW ophthalmological; antirheumatic; haemostatic; antibacterial; virucide;

KW protozoacide; fungicide; gene therapy; cell proliferative disorder; 

KW arteriosclerosis; atherosclerosis; cirrhosis; hepatitis; 

KW paroxysmal nocturnal haemoglobinuria; polycythaemia vera; psoriasis;

KW primary thrombocytopaenia; cancer; developmental disorder;

KW renal tubular acidosis; anaemia; mental retardation; 

KW neurological disorder; Alzheimer's disease; Parkinson's disease; 

KW epilepsy; autoimmune disorder; inflammatory disorder; AIDS; allergy; 

KW asthma; autoimmune thyroiditis; contact dermatitis; Crohn's disease; 

KW diabetes mellitus; glomerulonephritis; Goodpasture's syndrome; gout; 

KW Graves' disease; Hashimoto's thyroiditis; irritable bowel syndrome; 

KW multiple sclerosis; osteoarthritis; osteoporosis; pancreatitis; 

KW Reiter's syndrome; rheumatoid arthritis; Sjogren's syndrome; uveitis; 

KW infection. 

XX 

OS Homo sapiens. 
XX 

PN WO2003068943-A2.
XX 

PD 21-AUG-2003. 
XX 

PF 13-FEB-2003; 2003WO-US004712.
XX

PR 13-FEB-2002; 2002US-0357002P.

PR 06-MAR-2002; 2002US-0362439P.

PR 19-MAR-2002; 2002US-0366041P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Lehr-Mason PM, Kable AE, Elliott VS, Marquis JP, Baughn MR;

PI Chawla NK, Tran UK, Jin P, Tang YT, Zebarjadian Y, Swarnakar A; 



PI Hafalia AJA, Cocks BG, Warren BA, Emerling BM, Pearson CI, Chien D; 

PI Peterson DP, Fu GK, Yue H, Jackson AA, Jiang X, Hawkins PR, Lai PG; 

PI Khare R, Lee S, Lee SY, Richardson TW, Chang H; 
XX 

DR WPI; 2003-689669/65. 

DR N-PSDB; ADA21192. 
XX 

PT New human secreted proteins and polynucleotides, useful for diagnosing, 

PT treating or preventing autoimmune or inflammatory disorders (e.g. AIDS, 

PT allergy, asthma or anemia), multiple sclerosis, osteoporosis, cancer or 

PT hepatitis. 
XX 

PS Claim 1; Page 249-252; 295pp; English. 
XX 

CC The present sequence represents a human secreted protein (I) designated 

CC SECP-46. (I) have anti-HIV, antiallergic, antiinflammatory, antianaemic, 

CC antiparkinsonian, nootropic, anticonvulsant, antiarteriosclerotic, 

CC antiasthmatic, immunosuppressive, antithyroid, cytostatic, hepatotropic, 

CC dermatological, antidiabetic, nephrotropic, antigout, thyromimetic, 

CC neuroprotective, osteopathic, antiarthritic, antiparasitic, 

CC antihelminthic, antipsoriatic, uropathic, ophthalmological,

CC antirheumatic, haemostatic, antibacterial, virucide, protozoacide and 

CC fungicide activities, and can be used in gene therapy. The human secreted 

CC proteins (SECP), polynucleotides, agonists and antagonists of the present 

CC invention are useful for diagnosing, treating or preventing disorders 

CC associated with aberrant expression of SECP, particularly cell 

CC proliferative disorders (e.g. arteriosclerosis, atherosclerosis, 

CC cirrhosis, hepatitis, paroxysmal nocturnal haemoglobinuria, polycythaemia 

CC vera, psoriasis, primary thrombocytopaenia or cancer), developmental 

CC disorders (e.g. renal tubular acidosis, anaemia or mental retardation), 

CC neurological disorders (e.g. Alzheimer's disease, Parkinson's disease or 

CC epilepsy), autoimmune/inflammatory disorders (e.g. AIDS, allergies, 

CC asthma, autoimmune thyroiditis, contact dermatitis, Crohn's disease, 

CC diabetes mellitus, glomerulonephritis, Goodpasture's syndrome, gout, 

CC Graves' disease, Hashimoto's thyroiditis, irritable bowel syndrome, 

CC multiple sclerosis, osteoarthritis, osteoporosis, pancreatitis, Reiter's 

CC syndrome, rheumatoid arthritis, Sjogren's syndrome, uveitis), or viral, 

CC bacterial, fungal, parasitic, protozoan or helminthic infections. The 

CC SECP and polynucleotides are also useful in assessing the effects of 

CC exogenous compounds on the expression of nucleic acids secreted proteins. 

CC The polynucleotides encoding SECP are useful for creating transgenic 

CC animals to model human disease. 

XX 

SQ Sequence 1350 AA; 

Query Match 28.1%; Score 1897; DB 6; Length 1350; 

Best Local Similarity 40.4%; Pred. No. 1.3e-85; 

Matches 334; Conservative 90; Mismatches 325; Indels 78; Gaps 15 

Qy 94 CCPGFYESGEMC VPHCADKCVHGRCIAPNT CQCEPGWGGTNC 135

I I I I I I I I : : : I : I : I : I | | : | : I I 

Db 523 CDPGLY--GRFCHLTCPPWAFGPGCSEEC QCVQPHTQSCDKRDGSCSCKAGFRGERC 577 

Qy 136 SSACDGDHWGPHCTSRCQCKNGALCNPITGAC--HCAAGFRGWRCEDRCEQGTYGNDCHQ 193

: I : : : I I I II I I : : : I I I I i I : I I I I I : I : I 

Db 578 QAECELGYFGPGCWQACTCPVGVACDSVSGECGKRCPAGFQGEDCGQECPVGTFGVNCSS 637



Qy 194 RCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCP-CQNGGVCHHVTGE 252 

II III I I I : I I I I II I I II I I I : I I : : I I I I : I I I 
Db 638 SCSC-GGAPCHGVTGQCRCPPGRTGEDCEADCPEGRWGLGCQEICPACQHAARCDPETGA 696

Qy 253 CSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECP 312 

II I : : I : I I I I : I : I I I I I I I I I I I : I i : I I II I 

Db 697 CLCLPGFVGSRCQDVCPAGWYGPSCQTRCSCANDGHCHPATGHCSCAPGWTGFSCQRACD 756 

Qy 313 VGTYGVLCAETCQCVNG-GKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCH 371 

I : I I : II III : I I I I I I I I : I I I I : I I : I : I I : : I I 
Db 757 TGHWGPDCSHPCNCSAGHGSCDAISGLCLCEAGYVGPRCEQQ-CPQGHFGPGCEQLCQC- 814

Qy 372 LENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCA 431 

: : : I : I I I I III:] I I I : I I : I : I I I I I : I I I I 

Db 815 -QHGAACDHVSGACTCPAGWRGTFCEHACPAGFFGLDCRSACNCTAGAACDAVNGSCLCP 873 

Qy 432 PGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTW 491 

I :| I: II III III I I I I I II I I I II I I I 
Db 874 AGRRGPRCAETCPAHTYGHNCSQACACFNGASCDPVHGQCHCAPGWMGPSCLQECLPRDV 933 

Qy 492 GFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCH 551 

II : I I I I I I : III i 1 I : I I : I I I : I I I : I I I II 
Db 934 RAGCRHSGGCLNGGLCDPHTGRCLCPAGWTGDKCQSPCLRGWFGEACAQRCSCPPGAACH 993 

Qy 552 PTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCK-NGASCSPDDGICECAPGFRGTT 610

I I I I I I I :: I I : I I : I : I : II : I I I I I I I : I : 

Db 994 HVTGACRCPPGFTGSGCEQACPPGSFGEDCAQMCQCPGENPACHPATGTCSCAAGYHGPS 1053

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670 

11:11111 III I : : I I II I I III II II I I I I I I 

Db 1054 CQQRCPPGRYGPGCEQLC-GCL-NGGSCDAATGACRCPTGFLGTDCNLTCPQGRFGPNCT 1111 

Qy 671 GICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGE 730 

: I I I : I : : I I I I I I : I I : I I I I I : I I I I I : I 

Db 1112 HVCGCGQGAACDPVTGTCLCPPGRAGVRCERGCPQNRFGVGCEHTCSCRNGGLCHASNGS 1171 

Qy 731 CKCTPGWTGLYCTQRCPLGFYGKDCAL 757 

I I I I I I : I I I I I I I I 
Db 1172 CSCGLGWTGRHCELACPPGRYGAACHLECSCHNNSTCEPATGTCRCGPGFYGQACEHPCP 1231 

Qy 758 ICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNN 807 

: I I I : I I I I I II : I I III I I : I I — I II I II I 
Db 1232 PGFHGAGCQGLCWCQHGAPCDPISGRCLCPAGFHGHFCERGCEPGSFGEGCHQRCDCDGG 1291

Qy 808 STCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTST-ALPADS 853 

: I I : I I I 1 I I III: I : : I I : : I I I I I 

Db 1292 APCDPVTGLCLCPPGRSGATCNLGGPLRLPENPSLAQGSAGTLPASS 1338



RESULT 10
ABJ37904

ID ABJ37904 standard; protein; 1577 AA.
XX

AC ABJ37904;
XX

DT 22-MAY-2003 (first entry)
XX

DE NOVX protein sequence SEQ ID No 54.
XX

KW Hepatotropic; immunosuppressive; cardiant; hypertensive; tranquilizer;

KW vulnerary; virucide; antibacterial; protozoacide; fungicide; nootropic;

KW antiparasitic; neuroprotective; cerebroprotective; antiparkinsonian;

KW anticonvulsant; antiaddictive; analgesic; dermatological; keratolytic;

KW antiseborrheic; antirheumatic; antiarthritic; antiinflammatory; anti-HIV;

KW cytostatic; antiasthmatic; antipsoriatic; hypotensive; osteopathic;

KW antiulcer; anorectic; antidiabetic; antiallergic; haemostatic;

KW neuroleptic; antidepressant; antiinfertility; NOVX; human disease;

KW NOVX-associated disorder; trauma; viral; bacterial; fungal; protozoal;

KW parasitic infection; Alzheimer's disease; stroke; forensic biology;

KW immunogen; non-human transgenic animal; gene therapy.

XX

OS Unidentified.
XX






PN WO200281517-A2.
XX

PD 17-OCT-2002.
XX

PF 22-JAN-2002; 2002WO-US002064.
XX

PR 19-JAN-2001; 2001US-0262892P.

PR 23-JAN-2001; 2001US-0263598P.

PR 24-JAN-2001; 2001US-0263799P.

PR 25-JAN-2001; 2001US-0264117P.

PR 25-JAN-2001; 2001US-0264139P.

PR 26-JAN-2001; 2001US-0264478P.

PR 30-JAN-2001; 2001US-0263351P.

PR 02-MAR-2001; 2001US-0272870P.


PR 14-MAR-2001; 2001US-0275927P.

PR 14-MAR-2001; 2001US-0275990P.

PR 15-MAR-2001; 2001US-0276449P.

PR 20-MAR-2001; 2001US-0277358P.

PR 23-MAR-2001; 2001US-0278151P.

PR 29-MAR-2001; 2001US-0279857P.

PR 20-APR-2001; 2001US-0285140P.

PR 20-APR-2001; 2001US-0285141P.

PR 30-APR-2001; 2001US-0287484P.

PR 17-MAY-2001; 2001US-0291701P.

PR 08-JUN-2001; 2001US-0296960P.

PR 10-JUL-2001; 2001US-0304353P.

PR 10-JUL-2001; 2001US-0304355P.

PR 12-JUL-2001; 2001US-0304886P.

PR 09-AUG-2001; 2001US-0311289P.

PR 13-AUG-2001; 2001US-0311975P.

PR 16-AUG-2001; 2001US-0312937P.

PR 18-OCT-2001; 2001US-0330227P.

PR 29-NOV-2001; 2001US-0334198P.

XX

PA (CURA-) CURAGEN CORP.
XX

PI Decristofaro MF, Padigaru M, Miller C, Tchernev V, Zhong H;

PI Zhong M, Anderson D, Ballinger R, Gerlach V, Spytek KA, Rastelli L;

PI Kekuda R, Guo X, Zerhusen B, Andrew D, Mezes P, Patturajan M;

PI Burgess CE, Eisen A, Wolenc A, Baumgartner J, Shimkets RA, Gusev V;

PI Vernet CAM, Taupier RJ, Pena C, Shenoy S, Li L, Casman S, Boldog F;

PI Fernandes E, Smithson G, Malyankar U, Taillon B, Liu X;
XX 

DR WPI; 2003-058504/05. 

DR N-PSDB; ABT33369. 
XX 

PT New polypeptides, designated as NOVX, useful for diagnosing and treating 

PT infections, neurological diseases, cancer, allergy, and bone, 

PT immunological, skin, renal, brain, muscle and autoimmune disorders. 

XX 

PS Claim 1; Page 135-136; 672pp; English. 

XX 

CC The invention relates to a novel isolated polypeptide, designated NOVX 

CC (NOV1 - 33), consisting of a mature form of one of 61 sequences, given in 

CC the specification, or its variant, where amino acid residue(s) in the

CC variant differ from the mature form, provided that the variant differs in 

CC not more than 15 % of the amino acids from the sequence of the mature 

CC form. The NOVX polypeptides, nucleic acids encoding the polypeptides, and 

CC an antibody to the polypeptides, are useful for treating or preventing a 

CC NOVX-associated disorder in humans and for treating a syndrome associated 

CC with a human disease (NOVX-associated disorder). NOVX polypeptides and

CC the encoding nucleic acids, are useful for determining the presence of or 

CC predisposition to a disease associated with altered levels of NOVX 

CC polypeptide and polynucleotide, by measuring the level of polypeptide 

CC expression or the amount of nucleic acid from a mammal and comparing it 

CC with another mammal not having or not predisposed to the disease. NOVX 

CC polypeptide is also useful for identifying an agent that binds to NOVX 

CC and a cell expressing NOVX is useful for identifying an agent that 

CC modulates the expression or activity of NOVX. The antibodies and a 

CC polypeptide having 95 % sequence identity to NOVX polypeptide are useful 

CC for treating a pathological state in a mammal. The antibodies are also 

CC useful for determining the presence or amount of NOVX in a sample. NOVX 

CC polypeptides, polynucleotides and antibodies specific for the 

CC polypeptides are useful for treating or preventing disorders or syndromes 

CC including trauma, viral, bacterial, fungal, protozoal, and parasitic 

CC infections. They can also treat disorders such as e.g., Alzheimer's 

CC disease or a stroke. The NOVX encoding nucleic acids are useful for 

CC expressing the NOVX proteins, to detect NOVX mRNA, or a genetic lesion in 

CC a NOVX gene and to modulate NOVX activity. NOVX sequences are also useful 

CC for identifying a cell or tissue type in a biological sample, to amplify 

CC DNA sequences from very small biological samples such as tissues e.g. 

CC hair or skin or body fluids in forensic biology and as primers and probes 

CC for use in identifying and/or cloning NOVX homologues in other cell 

CC types. The NOVX proteins are useful as an immunogen to generate 

CC antibodies which are useful for diagnostically monitoring protein levels

CC and modulating NOVX activity. Cells comprising NOVX nucleic acids are 

CC useful for producing non-human transgenic animals which are useful for 

CC studying the function and/or activity of NOVX protein and for identifying 

CC and/or evaluating modulators of NOVX protein activity. The NOVX nucleic 

CC acids can be used in gene therapy. This sequence represents a NOVX 

CC protein of the invention 

XX 

SQ Sequence 1577 AA; 

Query Match 27.9%; Score 1879; DB 6; Length 1577; 

Best Local Similarity 40.4%; Pred. No. 1.1e-84;

Matches 325; Conservative 83; Mismatches 318; Indels 78; Gaps 15 



Qy 94 CCPGFYESGEMC VPHCADKCVHGRCIAPNT CQCEPGWGGTNC 135 

1 I I : : : I : I : I : I | | : | : | I 

Db 634 CDPGLY--GRFCHLACPPWAFGPGCSEEC QCVQPHTQSCDKRDGSCSCKAGFRGERC 688

Qy 136 SSACDGDHWGPHCTSRCQCKNGALCNPITGAC--HCAAGFRGWRCEDRCEQGTYGNDCHQ 193

: | : : : I I I II I I : : : I I I I I I : I I I I I : I : I 
Db 689 QAECEPGYFGPGCWQACTCPVGVACDSVSGECGKRCPAGFQGEDCGQECPVGTFGVNCSS 748 

Qy 194 RCQCQNGATCDHVTGECRCPPGYTGAFCE DLCP 226 

I I I I I I I I : I I I I I I I I I I 
Db 749 SCSC-GGAPCHGVTGQCRCPPGRTGEDCEAGECEGLWGLGCQEICPACHNAARCDPETGA 807

Qy 227 PGKHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPE 270 

I : I I 1 : I I i I I I I I I I I I I I I : I 
Db 808 CLCLPGFVGSRCQDCEAGWYGPSCQTMCSCANDGHCHQDTGHCSCAPGWTGFSCQRACDT 867 

Qy 271 GRFGKNCSQECQCHNG-GTCDAATGQCHCSPGYTGERC-QDECPVGTYGVLCAETCQCVN 328 

| : I : I I II 11:111:111 I I I I I I I I I I : I I : I I I : 
Db 868 GHWGPDCSHPCNCSAGHGSCDAISGLCLCEAGYVGPRCEQSECPQGHFGPGCEQRCQCQH 927 

Qy 329 GGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACK 388

I 1111111111:1 I I I I I : I : I li : I : : I I I 

Db 928 GAACDHVSGACTCPAGWRGTFCE-HACPAGFFGLDCRSACNC--TAGAACDAVNGSCLCP 984

Qy 389 PGWSGLYCNETCSPGF-YGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGT 447 

I | ||: I II I I I : I 1 I I I I I I : I I I I I : I I I I I 
Db 985 AGRRGPRCAESACPAHTYGHNCSQACACFNGASCDPVHGQCHCAPGWMGPSCLQACPAGL 1044

Qy 448 YGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGAC 507 

I I I I I I : I I I I I I I I I I : I : I II : I I I I I I 

Db 1045 YGDNCRHSCLCQNGGTCDPVSGHCACPEGWAGLACEVECLPRDVRAGCRHSGGCLNGGLC 1104 

Qy 508 NTLDGTCTCAPGWRGEKCELP--CQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSG 565 

: Ml I I I : I I : I I I I : I : I III I I I I I I I I I I 

Db 1105 DPHTGRCLCPAGWTGDKCQSPAACAKGTFGPHCEGRCACRWGGPCHLATGACLCPPGWRG 1164 

Qy 566 VHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCS 625 

| : | i : I I : II I I : I I I I I I I I : I : : I M : I I = 

Db 1165 PHLSAACLRGWFGEACAQRCSCPPGAACHHVTGACRCPPGFTGSGCEQACPPGSFGEDCA 1224 

Qy 626 QTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPID 685

III! : II I I I I I : I I : I I I I : I I : I I I I : I = 

Db 1225 QMC-QCPGENPACHPATGTCSCAAGYHGPSCQQRCPPGRYGPGCEQLCGCLNGGSCDAAT 1283 

Qy 686 RSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQR 745 

: I : I I : : I : II : II : I I I I I I I III I I I I I I : I : 

Db 1284 GACRCPTGFLGTDCNLTCPQGRFGPNCTHVCGCGQGAACDPVTGTCLCPPGRAGVRCERG 1343 

Qy 746 CPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCL 805 

I | : I I I I : I I I : I I : I I : I I I I I I I I I I I I I 
Db 1344 CPQNRFGVGCEHTCSCRNGGLCHASNGSCSCGLGWTGRHCELACPPGRYGAACHLECSCH 1403 

Qy 806 NNSTCDHITGTCYCSPGWKGARCD 829 

Mill: I 1 I I I I I : I I : 

Db 1404 NNSTCEPATGTCRCGPGFYGQACE 1427 



RESULT 11 
ADD78227 

ID ADD78227 standard; protein; 1261 AA. 
XX 

AC ADD78227; 
XX 

DT 29-JAN-2004 (first entry) 
XX 

DE Human CGDD-8.
XX 

KW Anabolic; Hypertensive; Respiratory; Anti-HIV; Antiallergic; 

KW Neuroprotective; Nootropic; Antianemic; Antiarteriosclerotic;

KW Antiinflammatory; Ophthalmological; Muscular; Hepatotropic;

KW Neuroprotective; Antiasthmatic; Anticonvulsant; Virucide; Antibacterial;

KW Fungicide; Antiparasitic; Protozoacide; Antihelminthic; Cytostatic;

KW Cerebroprotective; Antiparkinsonian; Antipsoriatic; Antigout; 

KW Antidiabetic; Antiarthritic; Antirheumatic; Osteopathic; Gene therapy; 

KW human; cell growth; cell differentiation; cell death; CGDD; 

KW cell proliferative disorder; cancer; developmental disorder; 

KW neurological disorder; autoimmune disorder; inflammatory disorder; 

KW infection; reproductive disorder. 

XX 

OS Homo sapiens. 
XX 

PN WO2003077875-A2.
XX 

PD 25-SEP-2003. 
XX 

PF 14-MAR-2003; 2003WO-US008310.
XX

PR 15-MAR-2002; 2002US-0364494P.

PR 29-MAR-2002; 2002US-0369129P.

PR 12-APR-2002; 2002US-0372511P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Kable AE, Tran UK, Hafalia AJA, Burford N, Honchell CD; 

PI Lehr-Mason PM, Duggan BM, Ramkumar J, Griffin JA, Richardson TW; 

PI Elliott VS, Jiang X, Jackson AA, Marquis JP, Chawla NK, Khare R; 

PI Becha SD, Lee SY, Swarnakar A, Yue H, Warren BA, Baughn MR, Lai PG; 

PI Lee S, Ho A, Gandhi AR, Yao MG; 

XX 

DR WPI; 2003-779081/73. 

DR N-PSDB; ADD78266. 
XX 

PT New polypeptides and polynucleotides associated with cell growth, 

PT differentiation and death, useful for diagnosing, treating or preventing 

PT e.g. developmental, neurological, autoimmune, inflammatory or 

PT reproductive disorders. 

XX 

PS Claim 1; SEQ ID NO 8; 320pp; English. 
XX 

CC The present invention relates to novel human proteins (I; ADD78220- 

CC ADD78258) and their coding sequences (II; ADD78259-ADD78297), which are

CC associated with cell growth, differentiation and death, referred to as 

CC CGDD-n proteins, where n is a number from 1 to 39. The CGDD proteins and 

CC their coding sequences are useful for diagnosing, treating or preventing 



CC cell proliferative disorders (e.g. cirrhosis, hepatitis, 

CC arteriosclerosis, psoriasis, primary thrombocytopenia) or cancers (e.g. 

CC adenocarcinoma, sarcoma or cancers of the bone, bone marrow, brain, 

CC breast, colon, kidney, liver, lung or uterus), developmental disorders 

CC (e.g. renal tubular acidosis, Becker muscular dystrophy, gonadal 

CC dysgenesis, hypothyroidism or seizures), neurological disorders (e.g. 

CC Pick's disease, cataract, epilepsy, ischemic cerebrovascular disease, 

CC stroke, Alzheimer's disease, Parkinson's disease or dementia), 

CC autoimmune/inflammatory disorders (e.g. AIDS, allergies, anemia, asthma, 

CC diabetes mellitus, bronchitis, osteoporosis, osteoarthritis, rheumatoid 

CC arthritis, contact dermatitis or gout), viral, bacterial, fungal,

CC parasitic, protozoan or helminthic infections, reproductive disorders 

CC (e.g. infertility, ectopic pregnancy, premature ovarian failure, delayed 

CC puberty or prostatitis) or disorders of the placenta (e.g. preeclampsia, 

CC choriocarcinoma, placenta previa, placental or maternal floor infarction 

CC or chronic villitis).

XX 

SQ Sequence 1261 AA; 

Query Match 27.8%; Score 1874.5; DB 7; Length 1261; 

Best Local Similarity 41.3%; Pred. No. 1.5e-84; 

Matches 328; Conservative 66; Mismatches 344; Indels 57; Gaps 10; 

Qy 83 GEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRC-IAPNTCQCEPGWGGTNCSSACDG 141

Ml: II : I : : I I I I I I : ! 

Db 396 GEHTL-TEKFVCLDDSF--GHDCSLTCDDCRNGGTCLLGLDGCDCPEGWTGLICNETCPP 452 

Qy 142 DHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGA 201 

| : | : I : I I : I I I : : I I I I I I I I I I I : I I I I : : I I I 

Db 453 DTFGKNCSFSCSCQNGGTCDSVTGACRCPPGVSGTNCEDGCPKGYYGKHCRKKCNCANRG 512

Qy 202 TCDHVTGECRCPPGYTGAFCEDLCPP 227

I : I I I I I III IN 
Db 513 RCHRLYGACLCDPGLYGRFCHLTCPPWAFGPGCSEECQCVQPHTQSCDKRDGSCSCKAGF 572 



Qy 228 GKHGPQCEQRCPCQNGGVCHHVTGECS--CPSGWMGTVCGQPCPEGRFG 274

I I I I I I I I I hill I I : i : I I II I I I I I 

Db 573 RGERCQAECELGYFGPGCWQACTCPVGVACDSVSGECGKRCPAGFQGEDCGQECPVGTFG 632 

Qy 275 KNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYH 334 

Ml || I I I I I I I M I I I I I I I : I I : I I I : I II 
Db 633 VNCSSSCSC-GGAPCHGVTGQCRCPPGRTGEDCEADCPEGHFGPGCEQRCQCQHGAACDH 691 

Qy 335 VSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGL 394 

I I I I I I I I : I II | | I : I : I II : I : : I I I I I 

Db 692 VSGACTCPAGWRGTFCE-HACPAGFFGLDCRSACNC--TAGAACDAVNGSCLCPAGRRGP 748

Qy 395 YCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSS 454 

| | | | M | | I : I I I I I I I I : I I I M : I I I I I I I I I 

Db 749 RCAETCPAHTYGHNCSQACACFNGASCDPVHGQCHCAPGWMGPSCLQACPAGLYGDNCRH 808

Qy 455 RCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTC 514 

I I : I I II I I I I I I : I I M : MINI: II 

Db 809 SCLCQNGGTCDPVSGHCACPEGWAGLACEKECLPRDVRAGCRHSGGCLNGGLCDPHTGRC 868 

Qy 515 TCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAE 574 

I | | | : I I : I I I : I I I : I I I II I I I I I I I I I : I 



Db 869 LCPAGWTGDKCQSPCLRGWFGEACAQRCSCPPGAACHHVlbAuRCPP&FIGbGChjyGOrt' 928

Qy 575 GRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHS 634

||:[| II 1 1 1 1 1 1 1 : 1 1 : 1 1

Db 929 GRYGPGCEQLCGCLNGGSCDAATGACRCPTGFLGTDCNLTCPQGRFGPNCTHVC-GCGQG 987

Qy 635 SGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGW 694

: | : | | 1 1 1 1 1 1 II III 1 1 : I 1 1 1 : : 1 1 1 II

Db 988 AA-CDPVTGTCLCPPGRAGVRCERGCPQNRFGVGCEHTCSCRNGGLCHASNGSCSCGLGW 1046

Qy 695 IGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKD 754

| | Ml : 1 1 1 : 1 1 1 : 1 1 1 : 1 1 1 : 1 1 1 1 1 1 : 1

Db 1047 TGRHCELACPPGRYGAACHLECSCHNNS I GhPAl G ILROGPGh YGQALhhLPCPPCji? nGAb 1106

Qy 755 CALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHIT 814

I : | M : 1 1 1 1 1 1 1 : 1 1 III 1 1 : 1 1 : : 1 1 1 1 1 1 1 : 1 1 : 1

Db 1107 CQGLCWCQHGAPCDPISGRCLCPAGFHGHFCERGCEPGSFGEGCHQRCDCDGGAPCDPVT 1166

Qy 815 GTCYCSPGWKGARCD 829

1 1 1 1 1 III:

Db 1167 GLCLCPPGRSGATCN 1181





RESULT 12
ABP75770

ID ABP75770 standard; protein; 349 AA.
XX

AC ABP75770;
XX

DT 10-FEB-2003 (first entry)
XX

DE Human secretory polypeptide SPTM SEQ ID NO 954.
XX

KW Human; SPTM; autoimmune disorder; inflammatory disorder; AIDS; anaemia;

KW asthma; Crohn's disease; neurological disorder; epilepsy; cancer;

KW Huntington's disease; Alzheimer's disease; Creutzfeldt-Jakob disease;

KW multiple sclerosis; Parkinson's disease; cell proliferative disorder;

KW anti-inflammatory; immunosuppressive; neuroprotective; nootropic;

KW neuroleptic; anticonvulsant; cytostatic; antiparkinsonian; anxiolytic;

KW antipsoriatic; antianaemic; anti-HIV; human immunodeficiency virus;

KW secretory polynucleotide; secretory protein.

XX

OS Homo sapiens.
XX

PN WO200283876-A2.
XX

PD 24-OCT-2002.
XX

PF 27-MAR-2002; 2002WO-US009921.
XX

PR 29-MAR-2001; 2001US-0280067P.

PR 29-MAR-2001; 2001US-0280068P.

PR 16-MAY-2001; 2001US-0291280P.

PR 17-MAY-2001; 2001US-0291829P.

PR 17-MAY-2001; 2001US-0291849P.

PR 19-JUN-2001; 2001US-0299428P.

PR 20-JUN-2001; 2001US-0299776P.

PR 20-JUN-2001; 2001US-0300001P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Daffo A, Jones AL, Tran AB, Dahl CR, Gietzen D, Chinn J; 

PI Dufour GE, Hillman JL, Yu JY, Tuason O, Yap PE, Amshey SR; 

PI Daughtery SC, Darn TC, Liu TF, Nguyen DA, Kleefeld Y, Gerstin EH; 

PI Peralta CH, David MH, Lewis SA, Chen AJ, Panzer SR, Harris B;

PI Flores V, Marwaha R, Lo A, Lan RY, Urashka ME; 

XX 

DR WPI; 2003-075543/07. 

DR N-PSDB; ABZ36212. 
XX 

PT New human secretory proteins and polynucleotides, useful for diagnosing, 

PT treating or preventing autoimmune/inflammatory disorders (e.g. AIDS),

PT neurological disorders (e.g. Alzheimer's), or cell proliferations or 

PT cancers. 
XX 

PS Claim 27; SEQ ID NO 954; 458pp + Sequence Listing; English. 
XX 

CC The invention relates to a secretory polynucleotide (designated sptm) 

CC comprising any of 567 polynucleotide sequences (ABZ35837-ABZ36403), a

CC naturally occurring polynucleotide sequence at least 90 % identical to 

CC the polynucleotide sequence, a polynucleotide complementary to them or an 

CC RNA equivalent of them. The polypeptide or polynucleotide are useful for 

CC treating, preventing or diagnosing a disease or condition associated with 

CC the expression of functional SPTM. These are particularly useful for 

CC diagnosing, treating or preventing autoimmune/inflammatory disorders 

CC (e.g. acquired immunodeficiency syndrome, anaemia, asthma or Crohn's 

CC disease), neurological disorders (e.g. epilepsy, Huntington's disease, 

CC dementia, stroke, Alzheimer's disease, Creutzfeldt-Jakob disease,

CC multiple sclerosis, cerebral palsy, Parkinson's disease, anxiety, 

CC schizophrenia or amnesia), or cell proliferative disorders (e.g. 

CC psoriasis, polycythemia vera, or cancers including adenocarcinoma, 

CC leukaemia, lymphoma, melanoma, myeloma, sarcoma or cancers of the brain, 

CC breast, cervix or prostate). The present sequence is one of the SPTM

CC proteins of the invention (ABP75384-ABP75962). Note: The sequence data

CC for this patent did not form part of the printed specification, but was

CC obtained in electronic format directly from WIPO at

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 349 AA; 

Query Match 27.6%; Score 1860; DB 6; Length 349; 

Best Local Similarity 99.7%; Pred. No. 2.4e-84; 

Matches 348; Conservative 1; Mismatches 0; Indels 0; Gaps 0 



Qy 792 GTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPA 851

I I I I I I I I I I ! I I I I * I I I I I I I I I I I 1 I I ! I t I I I I i I I

Db 1 GTYGYGCRQICDCLNDSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPA 60

Qy 852 DSYQIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGT 911

Db 61 DSYQIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGT 120

Qy 912 LPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGKRG 971

I I I I II I I I I I I I I I I I I i M I I I I I I I I I I I ! II I I I I

Db 121 LPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGKRG 180

Qy 972 PVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSNCSLSSSENPYA 1031 

1 I I I I I i I I I I I MINI I I I ! I I I I I I I I I I I I I I I I 

Db 181 PVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSNCSLSSSENPYA 240 

Qy 1032 TIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPTVSVVQGVFSNN 1091

I I I M I I I I I I I I I I ! I I I I I I i I I I I i I I I I I I I I I I I I I I I I I M I I I I I I I I 1 I I I I

Db 241 TIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPTVSVVQGVFSNN 300

Qy 1092 GRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140 

I | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 
Db 301 GRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 349 



RESULT 13
ABJ37901

ID ABJ37901 standard; protein; 1450 AA.
XX

AC ABJ37901;
XX

DT 22-MAY-2003 (first entry)
XX

DE NOVX protein sequence SEQ ID No 48.
XX

KW Hepatotropic; immunosuppressive; cardiant; hypertensive; tranquilizer;

KW vulnerary; virucide; antibacterial; protozoacide; fungicide; nootropic;

KW antiparasitic; neuroprotective; cerebroprotective; antiparkinsonian;

KW anticonvulsant; antiaddictive; analgesic; dermatological; keratolytic;

KW antiseborrheic; antirheumatic; antiarthritic; antiinflammatory; anti-HIV;

KW cytostatic; antiasthmatic; antipsoriatic; hypotensive; osteopathic;

KW antiulcer; anorectic; antidiabetic; antiallergic; haemostatic;

KW neuroleptic; antidepressant; antiinfertility; NOVX; human disease;

KW NOVX-associated disorder; trauma; viral; bacterial; fungal; protozoal;

KW parasitic infection; Alzheimer's disease; stroke; forensic biology;

KW immunogen; non-human transgenic animal; gene therapy.

XX

OS Unidentified.
XX

PN WO200281517-A2.
XX

PD 17-OCT-2002.
XX

PF 22-JAN-2002; 2002WO-US002064.
XX

PR 19-JAN-2001; 2001US-0262892P.

PR 23-JAN-2001; 2001US-0263598P.

PR 24-JAN-2001; 2001US-0263799P.

PR 25-JAN-2001; 2001US-0264117P.

PR 25-JAN-2001; 2001US-0264139P.

PR 26-JAN-2001; 2001US-0264478P.

PR 30-JAN-2001; 2001US-0263351P.

PR 02-MAR-2001; 2001US-0272870P.

PR 14-MAR-2001; 2001US-0275927P.

PR 14-MAR-2001; 2001US-0275990P.

PR 15-MAR-2001; 2001US-0276449P.

PR 20-MAR-2001; 2001US-0277358P.

PR 23-MAR-2001; 2001US-0278151P.

PR 29-MAR-2001; 2001US-0279857P.

PR 20-APR-2001; 2001US-0285140P.

PR 20-APR-2001; 2001US-0285141P.

PR 30-APR-2001; 2001US-0287484P.

PR 17-MAY-2001; 2001US-0291701P.

PR 08-JUN-2001; 2001US-0296960P.

PR 10-JUL-2001; 2001US-0304353P.

PR 10-JUL-2001; 2001US-0304355P.

PR 12-JUL-2001; 2001US-0304886P.

PR 09-AUG-2001; 2001US-0311289P.

PR 13-AUG-2001; 2001US-0311975P.

PR 16-AUG-2001; 2001US-0312937P.

PR 18-OCT-2001; 2001US-0330227P.

PR 29-NOV-2001; 2001US-0334198P.



XX 

PA (CURA-) CURAGEN CORP. 
XX 

PI Decristofaro MF, Padigaru M, Miller C, Tchernev V, Zhong H; 

PI Zhong M, Anderson D, Ballinger R, Gerlach V, Spytek KA, Rastelli L; 

PI Kekuda R, Guo X, Zerhusen B, Andrew D, Mezes P, Patturajan M; 

PI Burgess CE, Eisen A, Wolenc A, Baumgartner J, Shimkets RA, Gusev V; 

PI Vernet CAM, Taupier RJ, Pena C, Shenoy S, Li L, Casman S, Boldog F; 

PI Fernandes E, Smithson G, Malyankar U, Taillon B, Liu X; 

XX 

DR WPI; 2003-058504/05. 

DR N-PSDB; ABT33366. 
XX 

PT New polypeptides, designated as NOVX, useful for diagnosing and treating 

PT infections, neurological diseases, cancer, allergy, and bone, 

PT immunological, skin, renal, brain, muscle and autoimmune disorders. 

XX 

PS Claim 1; Page 129-130; 672pp; English. 
XX 

CC The invention relates to a novel isolated polypeptide, designated NOVX 

CC (NOV1 - 33), consisting of a mature form of one of 61 sequences, given in 

CC the specification, or its variant, where amino acid residue(s) in the

CC variant differ from the mature form, provided that the variant differs in 

CC not more than 15 % of the amino acids from the sequence of the mature 

CC form. The NOVX polypeptides, nucleic acids encoding the polypeptides, and 

CC an antibody to the polypeptides, are useful for treating or preventing a 

CC NOVX-associated disorder in humans and for treating a syndrome associated 

CC with a human disease (NOVX-associated disorder). NOVX polypeptides and

CC the encoding nucleic acids, are useful for determining the presence of or 

CC predisposition to a disease associated with altered levels of NOVX 

CC polypeptide and polynucleotide, by measuring the level of polypeptide 

CC expression or the amount of nucleic acid from a mammal and comparing it 

CC with another mammal not having or not predisposed to the disease. NOVX 

CC polypeptide is also useful for identifying an agent that binds to NOVX 

CC and a cell expressing NOVX is useful for identifying an agent that 

CC modulates the expression or activity of NOVX. The antibodies and a 

CC polypeptide having 95 % sequence identity to NOVX polypeptide are useful 

CC for treating a pathological state in a mammal. The antibodies are also 

CC useful for determining the presence or amount of NOVX in a sample. NOVX 

CC polypeptides, polynucleotides and antibodies specific for the 

CC polypeptides are useful for treating or preventing disorders or syndromes 

CC including trauma, viral, bacterial, fungal, protozoal, and parasitic 



CC infections. They can also treat disorders such as e.g., Alzheimer's 

CC disease or a stroke. The NOVX encoding nucleic acids are useful for 

CC expressing the NOVX proteins, to detect NOVX mRNA, or a genetic lesion in 

CC a NOVX gene and to modulate NOVX activity. NOVX sequences are also useful 

CC for identifying a cell or tissue type in a biological sample, to amplify 

CC DNA sequences from very small biological samples such as tissues e.g. 

CC hair or skin or body fluids in forensic biology and as primers and probes 

CC for use in identifying and/or cloning NOVX homologues in other cell 

CC types. The NOVX proteins are useful as an immunogen to generate 

CC antibodies which are useful for diagnostically monitoring protein levels

CC and modulating NOVX activity. Cells comprising NOVX nucleic acids are 

CC useful for producing non-human transgenic animals which are useful for 

CC studying the function and/or activity of NOVX protein and for identifying 

CC and/or evaluating modulators of NOVX protein activity. The NOVX nucleic 

CC acids can be used in gene therapy. This sequence represents a NOVX 

CC protein of the invention 

XX 

SQ Sequence 1450 AA; 

Query Match 27.6%; Score 1858.5; DB 6; Length 1450; 

Best Local Similarity 42.0%; Pred. No. 1.1e-83;

Matches 318; Conservative 83; Mismatches 324; Indels 33; Gaps 12; 



Qy 94 CCPGFYESGEMC VPHCADKCVHGRCIAPNT CQCEPGWGGTNC 135

I I I I I I I : I : 1=1 I h h I I

Db 624 CDPGLY--GRFCHLACPPWAFGPGCSEEC QCVQPHTQSCDKRDGSCSCKAGFRGERC 678

Qy 136 SSACDGDHWGPHCTSRCQCKNGALCNPITGAC--HCAAGFRGWRCEDRCEQGTYGNDCHQ 193

: | : : : | | | I I I I : : : I I I I I I : I I I I h I : I 

Db 679 QAECELGYFGPGCWQACTCPVGVACDSVSGECGKRCPAGFQGEDCGQECPVGTFGVNCSS 738 

Qy 194 RCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCP-CQNGGVCHHVTGE 252 

II M I 111:11111111 II llhl I : : M I I : I I I 
Db 739 SCSC-GGAPCHGVTGQCRCPPGRTGEDCEADCPEGRWGLGCQEICPACQHAARCDPETGA 797

Qy 253 CSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECP 312 

|| | : : I : I I I I : I : I I I I I I I M I I : I I : I I I I I 
Db 798 CLCLPGFVGSRCQDVCPAGWYGPSCQTRCSCANDGHCHPATGHCSCAPGWTGFSCQRACD 857

Qy 313 VGTYGVLCAETCQCVNG-GKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCH 371 

| : | | : || Ml : I I I I I I I I : I II I : I I = I : I I : : I I 

Db 858 TGHWGPDCSHPCNCSAGHGSCDAISGLCLCEAGYVGPRCEQQ-CPQGHFGPGCEQLCQC- 915 

Qy 372 LENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCA 431

: : : | : I I I I I I I : I I I I : I I : hi I I I h I I I I 

Db 916 -QHGAACDHVSGACTCPAGWRGTFCEHACPAGFFGLDCRSACNCTAGAACDAVNGSCLCP 974 

Qy 432 PGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTW 491 

| : | | : I! I I I II I h I I II I I I M h I M 
Db 975 AGRRGPRCAETCPAGLYGDNCRHSCLCQNGGTCDPVSGHCACPEGWAGLACEKECPPRDV 1034 

Qy 492 GFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCH 551 

| | : I I I I I I : I I I II h I h II I I I h I I M 
Db 1035 RAGCRHSGGCLNGGLCDPHTGRCLCPAGWAGDKCQSPCLRGWPGEACAQHCSCPPGAACH 1094 

Qy 552 PTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTC 611 

| | M | | | : : I I : I I h I I I I I I I I I I I I I I I I I 



Db 1095 HVTGACRCPPGFTGSGCEQGCPPGRYGPGCEQLCGCLNGGSCDAATGACRCPTGFLGTDC 1154 

Qy 612 QRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAG 671 

| | : | | : | 1 : I : I I I I I I I I I I Ml I 

Db 1155 NLTCPQGRFGPNCTHVC-GCGQGAA-CDPVTGTCLCPPGRAGVRCERGCPQNRFGVGCEH 1212 

Qy 672 ICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGEC 731 

| : I I I I : : I I I I I I I III : I I I'lll = I I 

Db 1213 TCSCRNGGLCHASNGSCSCGLGWTGRHCELACPPGRYGAACHLECSCHNNSTGEPATGTC 1272 

Qy 732 KCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPS 791 

:, |l: ! I I I I : i I : I II : I I I I I I I : I I HI M : I 

Db 1273 RCGPGFYGQACEHPCPPGFHGAGCQGLCWCQHGAPCDPISGRCLCPAGFHGHFCERGCEP 1332 

Qy 792 GTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCD 829 

| :: I I I I I I I : I I : I I I I II III: 

Db 1333 GSFGEGCHQRCDCDGGAPCDPVTGLCLCPPGRSGATCN 1370 



RESULT 14 


ABU03489 


ID ABU03489 standard; protein; 739 AA. 
XX 

AC ABU03489; 
XX 

DT 
XX 

DE Angiogenesis-associated human protein sequence #34. 
XX 

KW Human; angiogenesis-associated transcript; angiogenesis; 
KW angiogenesis-associated disease; cancer; cytostatic. 
XX 

OS Homo sapiens. 
XX 

PN WO200279492-A2. 
XX 

PD 10-OCT-2002. 
XX 

PF 14-FEB-2002; 2002WO-US004915. 
XX 

PR 14-FEB-2001; 2001US-00784356. 
PR 22-FEB-2001; 2001US-00791390. 
PR 19-APR-2001; 2001US-0285475P. 
PR 03-AUG-2001; 2001US-0310025P. 
PR 13-NOV-2001; 2001US-0350666P. 
PR 29-NOV-2001; 2001US-0334244P. 
XX 

PA (EOSB-) EOS BIOTECHNOLOGY INC. 
XX 

PI Murray R, Glynne R, Watson SR, Aziz N; 
XX 

DR WPI; 2003-040681/03. 
DR N-PSDB; ABX08773. 
XX 

PT Detecting angiogenesis-associated transcript in a cell for diagnosing and 
PT treating cancer by contacting a sample with a polynucleotide that 
PT exhibits changes in expression level as a function of time in tissue 
PT undergoing angiogenesis. 
XX 

PS Example 2; Page 212; 291pp; English. 
XX 

CC The present invention relates to methods and compositions for detecting 

CC an angiogenesis-associated transcript in a cell in a patient. The method 

CC involves contacting a biological sample from the patient with a 

CC polynucleotide that selectively hybridises to a sequence at least 80% 

CC identical to any of the angiogenesis-associated human polynucleotide 

CC sequences given in the specification. These angiogenesis-associated 

CC polynucleotide sequences comprise genes that exhibit changes in 

CC expression levels as a function of time in tissue undergoing 

CC angiogenesis. The method and the polynucleotide sequences of the 

CC invention are useful for diagnosing and treating angiogenesis and 

CC angiogenesis-associated diseases e.g. cancer. The polynucleotide 

CC sequences are also useful in the gene therapy of such disorders. The 

CC angiogenesis-associated proteins encoded by the polynucleotide sequences 

CC are useful as a vaccine for therapeutic and prophylactic immunisation. 

CC ABU03456-ABU03569 represent angiogenesis-associated protein sequences 

XX 

SQ Sequence 739 AA; 

Query Match 27.4%; Score 1847.5; DB 6; Length 739; 

Best Local Similarity 40.0%; Pred. No. 2e-83; 

Matches 364; Conservative 91; Mismatches 240; Indels 215; Gaps 24; 

Qy 261 GTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLC 320 

||:| II I II I I I I II I : I I I I I M Mill: : I I : : I I I I I : I I 

Db 1 GTICSLPCPEGFHGPNCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDC 60 

Qy 321 AETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHP 380 

Mi : I : : M I I I I II I : II I I M : I I I : I I I I " M I I 

Db 61 AETCDCAPDARCFPANGACLCEHGFTGDRCTDRLCPDGFYGLSCQAPCTCDREHSLSCHP 120 

Qy 381 MSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCS 440 

= 1 I I I : I I : II I : I : I I I : I I : I I : : I I I M I : I I : 

Db 121 MNGECSCLPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGVCQATSGLCQCAPGYTGPHCA 180 

Qy 441 TPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQ 500 

: | | | | | : I II : I I I : I I M : I I I I I I I : I I : I I I I I I I II : I I 
Db 181 SLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGTWGFSCNASCQ 240 

Qy 501 CLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCL 560 

| : | : I I I I I II I Mill I : I M I I I I 1 = 111 I MM 

Db 241 CAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGCDPVHGRCQCQ 300 

Qy 561 PGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFY 620 

|| | | I II II Ml I MM M MM I lllllll MM I II I 

Db 301 AGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRGPSCQRSCQPGRY 360 

Qy 621 GHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGT 680 

II I : I 

Db 361 GKR CVP 366 

Qy 681 CNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGL 740 

Y ' * ' " I I I Ml Mil I M I 

Db 367 CKCANHSFCHPSNGTCYCLAGWTGP 391 



Qy 


741 


Db 


392 


Qy 


801 


Db 


422 


Qy 


858 


Db 


459 


Qy 


918 


Db 


514 


Qy 


976 


Db 


562 


Qy 


1023 


Db 


621 






Db 


680 


Qy 


1119 


Db 


729 



I : I I I I I I : I : I : III I I II 
^rsORrPT.CTF^ANr^OPCOnGPGEKC HPE 421 



I I I I I I I I I : I ■ I : i : I : I 

-TGACVCPPGHSGAPCR IG 1 Q E P FT VMPT T P VAYN S L G 45 8 



I : I i : I : I : I : I I I I I I I I I I I I Ml: I : = M 



: I I I : : II M I I I M I : M M I I M M II I II 

-SYSHYYSNPSYHTLSQCSPNPPPPNK VPGPLFASLQNPERPG— GAQGHD 561 

PGTLPADWKH GGYLNELGAFGLDRSYMGKSL KDLGKNSEYNSSNCS 1022 

I I I I II I I I I : I : I M I I M I M I 



I M II I ! I I : I I I I : I M I II I : = : 

L-SSENPYATIRDLPSLPGGPRESSYMEMKGPPSGSPPRQPPQFWDSQRRRQPQPQRDSG 67 9 

TSANRNVYEVEPTVSWQGVFSNNGRLSQDP YDLPKNSHIPCHYDLLPVRD 1118 

| | | : | : : : III I I I I I I I I I M I I I I I 



RESULT 15 
ABJ37903 

ID ABJ37903 standard; protein; 1403 AA. 
XX 

AC ABJ37903; 
XX 

DT 22-MAY-2003 (first entry) 
XX 

DE NOVX protein sequence SEQ ID No 52. 
XX 

KW Hepatotropic; immunosuppressive; cardiant; hypertensive; tranquilizer; 

KW vulnerary; virucide; antibacterial; protozoacide; fungicide; nootropic; 

KW antiparasitic; neuroprotective; cerebroprotective; antiparkinsonian; 

KW anticonvulsant; antiaddictive; analgesic; dermatological; keratolytic; 

KW antiseborrheic; antirheumatic; antiarthritic; antiinflammatory; anti-HIV; 

KW cytostatic; antiasthmatic; antipsoriatic; hypotensive; osteopathic; 

KW antiulcer; anorectic; antidiabetic; antiallergic; haemostatic; 

KW neuroleptic; antidepressant; antiinfertility; NOVX; human disease; 

KW NOVX-associated disorder; trauma; viral; bacterial; fungal; protozoal; 

KW parasitic infection; Alzheimer's disease; stroke; forensic biology; 

KW immunogen; non-human transgenic animal; gene therapy. 

XX 

OS Unidentified. 



XX 

PN WO200281517-A2. 
XX 

PD 17-OCT-2002. 
XX 

PF 22-JAN-2002; 2002WO-US002064. 
XX 

PR 19-JAN-2001; 2001US-0262892P. 
PR 23-JAN-2001; 2001US-0263598P. 
PR 24-JAN-2001; 2001US-0263799P. 
PR 25-JAN-2001; 2001US-0264117P. 
PR 25-JAN-2001; 2001US-0264139P. 
PR 26-JAN-2001; 2001US-0264478P. 
PR 30-JAN-2001; 2001US-0263351P. 
PR 02-MAR-2001; 2001US-0272870P. 
PR 14-MAR-2001; 2001US-0275927P. 
PR 14-MAR-2001; 2001US-0275990P. 
PR 15-MAR-2001; 2001US-0276449P. 
PR 20-MAR-2001; 2001US-0277358P. 
PR 23-MAR-2001; 2001US-0278151P. 
PR 29-MAR-2001; 2001US-0279857P. 
PR 20-APR-2001; 2001US-0285140P. 
PR 20-APR-2001; 2001US-0285141P. 
PR 30-APR-2001; 2001US-0287484P. 
PR 17-MAY-2001; 2001US-0291701P. 
PR 08-JUN-2001; 2001US-0296960P. 
PR 10-JUL-2001; 2001US-0304353P. 
PR 10-JUL-2001; 2001US-0304355P. 
PR 12-JUL-2001; 2001US-0304886P. 
PR 09-AUG-2001; 2001US-0311289P. 
PR 13-AUG-2001; 2001US-0311975P. 
PR 16-AUG-2001; 2001US-0312937P. 
PR 18-OCT-2001; 2001US-0330227P. 
PR 29-NOV-2001; 2001US-0334198P. 
XX 

PA (CURA-) CURAGEN CORP. 
XX 

PI Decristofaro MF, Padigaru M, Miller C, Tchernev V, Zhong H; 
PI Zhong M, Anderson D, Ballinger R, Gerlach V, Spytek KA, Rastelli L; 
PI Kekuda R, Guo X, Zerhusen B, Andrew D, Mezes P, Patturajan M; 
PI Burgess CE, Eisen A, Wolenc A, Baumgartner J, Shimkets RA, Gusev V; 
PI Vernet CAM, Taupier RJ, Pena C, Shenoy S, Li L, Casman S, Boldog F; 
PI Fernandes E, Smithson G, Malyankar U, Taillon B, Liu X; 
XX 

DR WPI; 2003-058504/05. 
DR N-PSDB; ABT33368. 
XX 

PT New polypeptides, designated as NOVX, useful for diagnosing and treating 
PT infections, neurological diseases, cancer, allergy, and bone, 
PT immunological, skin, renal, brain, muscle and autoimmune disorders. 
XX 

PS Claim 1; Page 133; 672pp; English. 
XX 

CC The invention relates to a novel isolated polypeptide, designated NOVX 
CC (NOV1 - 33), consisting of a mature form of one of 61 sequences, given in 
CC the specification, or its variant, where amino acid residue(s) in the 
CC variant differ from the mature form, provided that the variant differs in 



CC not more than 15 % of the amino acids from the sequence of the mature 

CC form. The NOVX polypeptides, nucleic acids encoding the polypeptides, and 

CC an antibody to the polypeptides, are useful for treating or preventing a 

CC NOVX-associated disorder in humans and for treating a syndrome associated 

CC with a human disease (NOVX-associated disorder). NOVX polypeptides and 

CC the encoding nucleic acids, are useful for determining the presence of or 

CC predisposition to a disease associated with altered levels of NOVX 

CC polypeptide and polynucleotide, by measuring the level of polypeptide 

CC expression or the amount of nucleic acid from a mammal and comparing it 

CC with another mammal not having or not predisposed to the disease. NOVX 

CC polypeptide is also useful for identifying an agent that binds to NOVX 

CC and a cell expressing NOVX is useful for identifying an agent that 

CC modulates the expression or activity of NOVX. The antibodies and a 

CC polypeptide having 95 % sequence identity to NOVX polypeptide are useful 

CC for treating a pathological state in a mammal. The antibodies are also 

CC useful for determining the presence or amount of NOVX in a sample. NOVX 

CC polypeptides, polynucleotides and antibodies specific for the 

CC polypeptides are useful for treating or preventing disorders or syndromes 

CC including trauma, viral, bacterial, fungal, protozoal, and parasitic 

CC infections. They can also treat disorders such as e.g., Alzheimer's 

CC disease or a stroke. The NOVX encoding nucleic acids are useful for 

CC expressing the NOVX proteins, to detect NOVX mRNA, or a genetic lesion in 

CC a NOVX gene and to modulate NOVX activity. NOVX sequences are also useful 

CC for identifying a cell or tissue type in a biological sample, to amplify 

CC DNA sequences from very small biological samples such as tissues e.g. 

CC hair or skin or body fluids in forensic biology and as primers and probes 

CC for use in identifying and/or cloning NOVX homologues in other cell 

CC types. The NOVX proteins are useful as an immunogen to generate 

CC antibodies which are useful for diagnostically monitoring protein levels 

CC and modulating NOVX activity. Cells comprising NOVX nucleic acids are 

CC useful for producing non-human transgenic animals which are useful for 

CC studying the function and/or activity of NOVX protein and for identifying 

CC and/or evaluating modulators of NOVX protein activity. The NOVX nucleic 

CC acids can be used in gene therapy. This sequence represents a NOVX 

CC protein of the invention 

XX 

SQ Sequence 1403 AA; 

Query Match 26.2%; Score 1770; DB 6; Length 1403; 

Best Local Similarity 35.9%; Pred. No. 2.4e-79; 

Matches 317; Conservative 74; Mismatches 345; Indels 146; Gaps 13; 

Qy 83 GEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRC-IAPNTCQCEPGWGGTNCSSACDG 141 

III: II : I I M II:: • : I 

Db 503 GEHTL-TEKFVCLDDSF--GHDCSLTCDDCRNGGTCLLGLDGCDCPEGWTGLICNESCPP 559 

Qy 142 DHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGA 201 

I : | : | : I I : I I I : : I I I I I =111 I :*l I I 

Db 560 DTFGKNCSFSCSCQNGGTCDSVTGACRCPPGVSGTNCEDGCPKGYYGKHCRKKCNCANRG 619 

Qy 202 TCDHVTGECRCPPGYTGAFCEDLCP 226 

| : | | I II I II II 
Db 620 RCHRLYGACLCDPGLYGRFCHLACPPWAFGPGCSEECQCVQPHTQSCDKRDGSCSCKAGF 679 

Qy 227 PGKHGPQCEQRCPCQNGGVCHHVTGECS— CPSGWMGTVCGQPCPEGRFG 274 

|| 1:111 M : I : I I I I I I I I I 


Db 


680 



Qy 275 KNCSQECQ CHNGGTCDA 291 

III i HI II 

Db 740 VNCSSSCSCGGAPCHGVTGQCRCPPGRTGEDCEAGECEGLWGLGCQEICPACHNAARCDP 799 

Qy 292 ATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYH 334 

II I I II: I MM I III I Mill: 

Db 800 ETGACLCLPGFVGSRCQD-CEAGWYGPSCQTMCSCANDGHCHQDTGHCSCAPGWTGFSCQ 858 



Qy 335 VSGACLCEAGFAGERCEARLCPEGLYGIKCDKR 367 

: I I I I I I I I : I I I I I I : I : I I : : I 
Db 859 RACDTGHWGPDCSHPCNCSAGHGSCDAISGLCLCEAGYVGPRCEQSECPQGHFGPGCEQR 918 



Qy 368 CPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGK 427 

|| : : : | : I I I I I I I : I I I I : I I : I : I I I M : I I 

Db 919 CQC--QHGAACDHVSGACTCPAGWRGTFCEHACPAGFFGLDCRSACNCTAGAACDAVNGS 976 

Qy 428 CTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCP 487 

M 1:1 I : M I I I III I I I I I I I I I I III I II 
Db 977 CLCPAGRRGPRCAETCPAHTYGHNCSQACACFNGASCDPVHGQCHCAPGWMGPSCLQACP 1036 

Qy 488 SGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHA 547 

: I : I I : : : I I I I I I M : I I I : 

Db 1037 AGLYGDNCRHSCLCQNGGTCDPVSGHCACPEGWAGLACEVECLPRDVRAGCRHSGGCLNG 1096 

Qy 548 DGCHPTTGHCRCLPGWSGVHCDS--VCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPG 605 

| | M | I I I : I M II : I : I I : I I h I I I I I I I 

Db 1097 GLCDPHTGRCLCPAGWTGDKCQSPAACAKGTFGPHCEGRCACRWGGPCHLATGACLCPPG 1156 

Qy 606 FRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRF 665 

: | | | | : : | |:| I I : I I I : I I I I I I I I I : I : I I I I 

Db 1157 WRGPHLSAACLRGWFGEACAQRC-SCPPGAA-CHHVTGACRCPPGFTGSGCEQACPPGSF 1214 

Qy 666 GKNCAGICTCT-NNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFC 724 

| : : I I : I ! I hi Ml M I Ml I I I I I I 

Db 1215 GEDCAQMCQCPGENPACHPATGTCSCAAGYHGPSCQQRCPPGRYGPGCEQLCGCLNGGSC 1274 

Qy 725 SAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRH 784 

| MM 1:1 I M I M M Ml II II :M I I I I 
Db 1275 DAATGACRCPTGFLGTDCNLTCPQGRFGPNCTHVCGCGQGAACDPVTGTCLCPPGRAGVR 1334 

Qy 785 CEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGA 826 

M : I I MM III I I I I I 

Db 1335 CERGCPQNRFGVGCEHTCSCRNGGLCHASKRQLLLWPGLDGA 1376 



Search completed: March 26, 2004, 16:08:52 
Job time : 78.3326 sees 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: March 26, 2004, 16:06:56 ; Search time 25.759 Seconds 

(without alignments) 
2284.780 Million cell updates/sec 



Title: US-10-092-390-2 

Perfect score: 6744 

Sequence: 1 MVISLNSCLSFICLLLCHWI SSPKQEDSGGSSSNSSSSSE 1140 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Issued_Patents_AA:* 

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep: * 

2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep: * 

3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep: * 

4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep: * 

5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:* 

6: /cgn2_6/ptodata/2/iaa/backfiles1.pep: * 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
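
The Query Match and Best Local Similarity figures printed with each alignment appear to follow from other values in this report: Query Match agrees with the hit score divided by the perfect score, and Best Local Similarity agrees with Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). The Python sketch below illustrates that reading; it is an assumption checked against the numbers in this listing, not GenCore's documented method, and the function names are hypothetical.

```python
# Sketch only: formulas inferred from the values printed in this report,
# not taken from GenCore documentation. Function names are hypothetical.

def query_match(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Values for the first alignment listed below (score 1037, perfect score 6744).
    print(query_match(1037, 6744))
    print(best_local_similarity(326, 84, 304, 568))
```

Under this reading, the same two functions reproduce the percentages reported for the other alignments in the listing as well.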
ALIGNMENTS 



RESULT 1 

US-08-185-432-18 

Sequence 18, Application US/08185432 
Patent No. 5750652 
GENERAL INFORMATION: 

APPLICANT: Artavanis-Tsakonas, Spyridon 
APPLICANT: Busseau, Isabelle 
APPLICANT: Diederich, Robert J. 
APPLICANT: Xu, Tian 
APPLICANT: Matsuno, Kenji 

TITLE OF INVENTION: DELTEX PROTEINS, NUCLEIC ACIDS, AND 
TITLE OF INVENTION: ANTIBODIES, AND RELATED METHODS AND COMPOSITIONS 
NUMBER OF SEQUENCES: 23 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: PENNIE & EDMONDS 
STREET: 1155 Avenue of the Americas 
CITY: New York 
STATE: New York 



COUNTRY: U.S.A. 
ZIP: 10036-2711 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/185,432 
FILING DATE: 21-JAN-1994 
CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 
NAME: Misrock, S. Leslie 
REGISTRATION NUMBER: 18,872 
REFERENCE/DOCKET NUMBER: 7326-006 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (212) 790-9090 
TELEFAX: (212) 869-8864/9741 
TELEX: 66141 PENNIE 
INFORMATION FOR SEQ ID NO: 18: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2523 amino acids 
TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-185-432-18 

Query Match 15.4%; Score 1037; DB 1; Length 2523; 

Best Local Similarity 25.4%; Pred. No. 1.5e-60; 

Matches 326; Conservative 84; Mismatches 304; Indels 568; Gaps 78; 



Qy 83 GEKTMYRR KSQC CP-GFYESGEMCVPHCADKCVHGR 117 

||: : |:|| MM : : : M : I I : 

Db 53 GERCQFPNPCTIKNQCMNFGTCEPVLQGNAIDFICHCPVGF--TDKVCLTPVDNACVNNP 110 

Qy 118 C IAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNP--IT 164 

| : I : I I II I : I I I I Ml II I 

Db 111 CRNGGTCELLNSVTEYKCRCPPGWTGDSCQQA DPCASN-PCANGGKCLPFEIQ 162 

Qy 165 GACHCAAGFRGWRCE DRCEQ GTYGNDCHQR CQ 196 

I I II I I: : I I Mill I 

Db 163 YICKCPPGFHGATCKQDINECSQNPCKNGGQCINEFGSYRCTCQNRFTGRNCDEPYVPCN 222 

Qy 197 CQNGATC DHVTGECRCPPGYTGAFCED LC 225 

I M I I I : : I I I I = : I I I : 1 

Db 223 PSPCLNGGTCRQTDDTSYDCTCLPGFSGQNCEENIDDCPSNNCRNGGTCVDGVNTYNCQC 282 

Qy 226 PPGKHGPQCEQ— RC PCQNGGVCHHVTG--ECSCPSGWMGTVCGQ 266 

|| ||: I Mill I M I I I : I I I I : 

Db 283 PPDWTGQYCTEDVDECQLMPNACQNGGTCHNTYGGYNCVCVNGWTGEDCSENIDDCANAA 342 



Qy 267 PCPEGRFGKNC— SQEC— QCHNGGTCDA— ATGQ-- CHCSPG 301 

I I II I I I Mill I: I I II 

Db 343 CHSGATCHDRVASFYCECPHGRTGLLCHLDNACISNPCNEGSNCDTNPVNGKAICTCPPG 402 

Qy 302 YTGERCQ---DECPVGTYGVLCAETCQCVNGGKCYHVSGA--CLCEAGFAGERCEARLCP 356 

| | | | Ml M I : I I : I : M I I M I I i I I • 

Db 403 YTGPACNNDVDECSLGAN PCEHGGRCTNTLGSFQCNCPQGYAGPRCEIDV 452 

Qy 357 EGLYGIKCDKRC PCHLENTHSCHPMSGE CACKPGWSGLYC 396 

| || : I : I II I I I I : I I ! I 

Db 453 N ECLSNPC--QNDSTCLDQIGEFQCICMPGYEGLYCETNIDECASNPCLHN 501 

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC DSVTGK- 427 

I Is I : I I I I : I I s 

Db 502 GKCIDKINEERCDCPTGFSGNLCQHDFDECTSTPCKNGAKCLDGPNSYTCQCTEGFTGRH 561 



Qy 428 CTCAPGFKGIDC STP 442 

I I I I : I I M 
Db 562 CEQDINECI PDPCHYGTCKDGIATFTCLCRPGYTGRLCDNDINECLSKPCLNGGQCTDRE 621 

Qy 443 CPLGTYGINCSSR CG CKNDAVCSPVDG-SCTCKAGWHGVDCSIR 485 

11111:11:: | II : I I I I I : I : I I : I 

Db 622 NGYICTCPKGTTGVNCETKIDDCASNLCDNGKCIDKIDGYECTCEPGYTGKLCNININEC 681 



Qy 486 CPSGTWGFGC 495 

I I 1:1 

Db 682 DSNPCRNGGTCKDQINGFTCVCPDGYHDHMCLSEVNECNSNPCIHGACHDGVNGYKCDCE 741 

Qy 496 NLTCQ CLNGGACNTLDGT--CTCAPGWRGEKCEL PC-Q 530 

II: I : I I I I : I Nihil: II 
Db 742 AGWSGSNCDINNNECESNPCMNGGTCKDMTGAYICTCKAGFSGPNCQTNINECSSNPCLN 801 

Qy 531 DGT YGLNC AERCD CSHADGCHPT TGHCRCLPGWS 564 

II III h I : I : I 

Db 802 HGTCIDDVAGYKCNCMLPYTGAICEAVLAPCAGSPCKNGGRCKESEDFETFSCECPPGWQ 861 

Qy 565 GVHCD SVCAEGRWGPNCSL PCYCKNGASC 593 

||: I I I I I : I I I I I I 

Db 862 GQTCEIDMNECVNRPCRNGATCQNTNGSYKCNCKPGYTGRNCEMDIDDCQPNPCHNGGSC 921 

Qy 594 SPDDGI CECAPGFRGTTCQR ICSPGFYGHRC 624 

I I I I I I I N I h I I I I I I 

Db 922 S--DGINMFFCNCPAGFRGPKCEEDINECASNPCKNGANCTDCVNSYTCTCQPGFSGIHC 979 

Qy 625 SQTCPQCVHSS GPCHHITGL CDCLPGFTGALC NE 658 

II II lllh MINI: II 

Db 980 ESNTPDCTESSCFNGGTC--IDGINTFTCQCPPGFTGSYCQHDINECDSKPCLNGGTCQD 1037 

Qy 659 VCPSGRFGKNCAGI CTCTNNGTC NPIDRSCQCYPGWIGSDCSQP 702 

| | | | I I : MM I M : I I M I I 

Db 1038 SYGTYKCTCPQGYTGLNCQNLVRWCDSSPCKNGGKCWQTNNFYR-CECKSGWTGVYCDVP 1096 

Qy 703 CPPA — HWGPNCIHTCNCHNGAFC — SAYDGECKCTPGWTGLYCTQRCPLGFYGKDC 755 

|| | : : I II I : I : I I : I I I M : I 
Db 1097 SVSCEVAAKQQGVDIVHL— CRNSGMCVDTGNTHFCRCQAGYTGSYCEEQV DEC 1148 

Qy 756 ALICQCQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

: | | | | | | | : : I I I M : M : 

Db 1149 S-PNPCQNGATCTDYLGGYSCECVAGYHGVNCSEEINECLSHPCQNGGTCIDLINTYKCS 1207 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-TCYCSPGWKGARCDQAG 832 

| M II I I I I I I I : I I I I I : M h 
Db 1208 CPRGTQGVHCEINVDDCTPFYDSFTLEPKCFNNGKCIDRVGGYNCICPPGFVGERCE 1264 



Qy 833 VIIVGNLNS-LSRTSTALPADS 853 

I : : I I I ill 
Db 1265 GDVNECLSN PCDS 1277 



RESULT 2 
US-08-899-232-3 

Sequence 3, Application US/08899232 
Patent No. 6436650 
GENERAL INFORMATION: 
APPLICANT: Artavanis-Tsakonas, Spyridon 
APPLICANT: Qi, Huilin 

TITLE OF INVENTION: ACTIVATED FORMS OF NOTCH AND METHODS BASED THEREON 
FILE REFERENCE: 7326-046 

CURRENT APPLICATION NUMBER: US/08/899,232 
CURRENT FILING DATE: 1997-07-23 
NUMBER OF SEQ ID NOS: 4 
SOFTWARE: PatentIn Ver. 2.0 
SEQ ID NO 3 
LENGTH: 2523 
TYPE: PRT 

ORGANISM: Xenopus sp. 
US-08-899-232-3 

Query Match 15.4%; Score 1037; DB 4; Length 2523; 

Best Local Similarity 25.4%; Pred. No. 1.5e-60; 

Matches 326; Conservative 84; Mismatches 304; Indels 568; Gaps 78; 



Qy 83 GEKTMYRR KSQC CP-GFYESGEMCVPHCADKCVHGR 117 

||:: 1:11 || | I : : : I : : I I : 

Db 53 GERCQFPNPCTIKNQCMNFGTCEPVLQGNAIDFICHCPVGF--TDKVCLTPVDNACVNNP 110 

Qy 118 C IAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNP--IT 164 

| : |:| III I =1 1 II IN I I I 

Db 111 CRNGGTCELLNSVTEYKCRCPPGWTGDSCQQA DPCASN-PCANGGKCLPFEIQ 162 

Qy 165 GACHCAAGFRGWRCE DRCEQ GTYGNDCHQR CQ 196 

II III I : : I I I : I I I I 

Db 163 YICKCPPGFHGATCKQDINECSQNPCKNGGQCINEFGSYRCTCQNRFTGRNCDEPYVPCN 222 

Qy 197 CQNGATC DHVTGECRCPPGYTGAFCED LC 225 

Mill I : : I I I I : : I I I : 1 

Db 223 PSPCLNGGTCRQTDDTSYDCTCLPGFSGQNCEENIDDCPSNNCRNGGTCVDGVNTYNCQC 282 

Qy 226 PPGKHGPQCEQ RC PCQNGGVCHHVTG--ECSCPSGWMGTVCGQ 266 

|| ||: I I I I I I I I : I IhIM I : 

Db 283 PPDWTGQYCTEDVDECQLMPNACQNGGTCHNTYGGYNCVCVNGWTGEDCSENIDDCANAA 342 

0v 267 PCPEGRFGKNC— SQEC— QCHNGGTCDA— ATGQ— CHCSPG 301 

| I I I I i I I : I M I : 1 I M 

Db 343 CHSGATCHDRVASFYCECPHGRTGLLCHLDNACISNPCNEGSNCDTNPVNGKAICTCPPG 402 

Qy 302 YTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA--CLCEAGFAGERCEARLCP 356 

Ml I Ml :| I :l 1:1 : h I I |:|| Ml : 

Db 403 YTGPACNNDVDECSLGAN PCEHGGRCTNTLGSFQCNCPQGYAGPRCEIDV-- 452 



Qy 357 EGLYGIKCDKRC PCHLENTHSCHPMSGE — CACKPGWSGLYC 396 

1 II : I : i II 1111:1111 

Db 453 NECLSNPC--QNDSTCLDQIGEFQCICMPGYEGLYCETNIDECASNPCLHN 501 

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC DSVTGK- 427 

II I I I I II i : I : I i I I -II: 

Db 502 GKCIDKINEFRCDCPTGFSGNLCQHDFDECTSTPCKNGAKCLDGPNSYTCQCTEGFTGRH 561 

Qy 428 CTCAPGFKGIDC STP 442 

I I I I : I I M 
Db 562 CEQDINECIPDPCHYGTCKDGIATFTCLCRPGYTGRLCDNDINECLSKPCLNGGQCTDRE 621 

Qy 443 CPLGTYGINCSSR CG CKNDAVCSPVDG-SCTCKAGWHGVDCSIR 485 

I I I I I : I I : : I II : I I I I I : I : I I : I 

Db 622 NGYICTCPKGTTGVNCETKIDDCASNLCDNGKCIDKIDGYECTCEPGYTGKLCNININEC 681 

Qy 486 CPSGTWGFGC 495 

I I I: I 

Db 682 DSNPCRNGGTCKDQINGFTCVCPDGYHDHMCLSEVNECNSNPCIHGACHDGVNGYKCDCE 741 

Qy 496 NLTCQ CLNGGACNTLDGT--CTCAPGWRGEKCEL PC-Q 530 

II: I : II I I : I III h I h I I 

Db 742 AGWSGSNCDINNNECESNPCMNGGTCKDMTGAYICTCKAGFSGPNCQTNINECSSNPCLN 801 

Qy 531 DGT YGLNC AERCD CSHADGCHPT TGHCRCLPGWS 564 

|| | M I: I : I : I I I III 

Db 802 HGTCIDDVAGYKCNCMLPYTGAICEAVLAPCAGSPCKNGGRCKESEDFETFSCECPPGWQ 861 

Qy 565 GVHCD SVCAEGRWGPNCSL PCYCKNGASC 593 

||: I I I I I : I I I I I I 

Db 862 GQTCEIDMNECVNRPCRNGATCQNTNGSYKCNCKPGYTGRNCEMDIDDCQPNPCHNGGSC 921 

Qy 594 SPDDGI CECAPGFRGTTCQR ICSPGFYGHRC 624 

I I I I Mill: I II I I I 

Db 922 S--DGINMFFCNCPAGFRGPKCEEDINECASNPCKNGANCTDCVNSYTCTCQPGFSGIHC 979 

Qy 625 SQTCPQCVHSS GPCHHITGL CDCLPGFTGALC NE 658 

|| M II I 1 : I I II I I I : I II 

Db 980 ESNTPDCTESSCFNGGTC--IDGINTFTCQCPPGFTGSYCQHDINECDSKPCLNGGTCQD 1037 

Qy 659 VCPSGRFGKNCAGI CTCTNNGTC NPIDRSCQCYPGWIGSDCSQP 702 

| | | | I I : I I I I I I I : I I I ! I I 

Db 1038 SYGTYKCTCPQGYTGLNCQNLVRWCDSSPCKNGGKCWQTNNFYR-CECKSGWTGVYCDVP 1096 

Qy 703 CPPA — HWGPNCIHTCNCHNGAFC — SAYDGECKCTPGWTGLYCTQRCPLGFYGKDC 755 

|| | : : I II I : I : I I : I I I I : : : I 

Db 1097 SVSCEVAAKQQGVDIVHL— CRNSGMCVDTGNTHFCRCQAGYTGSYCEEQV DEC 1148 

Qy 756 ALICQCQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

I I I I I I I :: I M I : I : I : : 

Db 1149 S-PNPCQNGATCTDYLGGYSCECVAGYHGVNCSEEINECLSHPCQNGGTCIDLINTYKCS 1207 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-TCYCSPGWKGARCDQAG 832 

1 | | | I I I III I M I M II: I lh 
Db 1208 CPRGTQGVHCEINVDDCTPFYDSFTLEPKCFNNGKCIDRVGGYNCICPPGFVGERCE 1264 

Qy 833 VIIVGNLNS-LSRTSTALPADS 853 



I : : I II IN 
Db 1265 GDVNECLSN PCDS 1277 



RESULT 3 

US-08-185-432-17 

Sequence 17, Application US/08185432 
Patent No. 5750652 
GENERAL INFORMATION: 

APPLICANT: Artavanis-Tsakonas, Spyridon 
APPLICANT: Busseau, Isabelle 
APPLICANT: Diederich, Robert J. 
APPLICANT: Xu, Tian 
APPLICANT: Matsuno, Kenji 

TITLE OF INVENTION: DELTEX PROTEINS, NUCLEIC ACIDS, AND 
TITLE OF INVENTION: ANTIBODIES, AND RELATED METHODS AND COMPOSITIONS 
NUMBER OF SEQUENCES: 23 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: PENNIE & EDMONDS 
STREET: 1155 Avenue of the Americas 
CITY: New York 
STATE: New York 
COUNTRY: U.S.A. 
ZIP: 10036-2711 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/185,432 
FILING DATE: 21-JAN-1994 
CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 
NAME: Misrock, S. Leslie 
REGISTRATION NUMBER: 18,872 
REFERENCE/DOCKET NUMBER: 7326-006 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (212) 790-9090 
TELEFAX: (212) 869-8864/9741 
TELEX: 66141 PENNIE 
INFORMATION FOR SEQ ID NO: 17: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2556 amino acids 
TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-185-432-17 

Query Match 15.4%; Score 1035.5; DB 1; Length 2556; 

Best Local Similarity 25.9%; Pred. No. 1.9e-60; 

Matches 317; Conservative 84; Mismatches 304; Indels 519; Gaps 74; 

Qy 94 CCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCSSACDGDH 143 

I I I I I : I : : I : I : I : I I I I I : I I 

Db 89 CALGF--SGPLCLTPLDNACLTNPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQA 141 



Qy 144 WGPHCTSRCQCKNGALCNPITGA — CHCAAGFRG WRCEDRCEQG TYGNDCHQ- 193 

II I I I I I : I I I II I : : I I : I I I 

Db 142 — DPCASN-PCANGGQCLPFEASYICHCPPSFHGPTCWQDVNECGQKPRLCRHGGTCHNE 198 

Qy 194 RC QCQNGATC DHVTGECRCPPGYTGAFCE 222 

M I I I I I I I 1 I I I I I : I I I I 

Db 199 VGSYRCVCRATHTGPNCEWPYVPCSPSPCQNGGTCRPTGDVTHECACLPGFTGQNCEENI 258 

Qy 223 DLCPPG--KHGPQC EQRCP CQNGGVCHHVTG- 251 

Mi I : I i II I I I I I I I : I 

Db 259 DDCPGNNCKNGGACVDGVNTYNCPCPPEWTGQYCTEDVDECQLMPNACQNGGTCHNTHGG 318 

Qy 252 -ECSCPSGWMGTVCGQ PCPEGRFGKNC--SQEC 281 

I I : I I I I : I I I I I I : I 

Db 319 YNCVCVNGWTGEDCSENIDDCASAACFHGATCHDRVASFYCECPHGRTGLLCHLNDACIS 37 8 

Qy 282 -QCHNGGTCDA--ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY 333 

hill h I I I I I I I I I i : I I : I I I 

Db 379 NPCNEGSNCDTNPVNGKAICTCPSGYTGPACSQDVDECSLGAN PCEHAGKCI 430 

Qy 334 HVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRC PCHLENTHSCHPMSGE--CA 386 

: I : I I I : I I I I : I I I : I : I Ml 

Db 431 NTLGSFECQCLQGYTGPRCEIDV NECVSNPC--QNDATCLDQIGEFQCM 477 

Qy 387 CKPGWSGLYC NE TCSPGFYGEACQ QICS C 415 

| | | : |: : I M I I I I I I hi 

Db 478 CMPGYEGVHCEVNTDECASSPCLHNGRCLDKINEFQCECPTGFTGHLCQYDVDECASTPC 537 

Qy 416 QNGADC DSV-TGKCTCAPGFKG 436 

: I I I I : i 

Db 538 KNGAKCLDGPNTYTCVCTEGYTGTHCEVDIDECDPDPCHYGSCKDGVATFTCLCRPGYTG 597 

Qy 437 IDCST PCPL GTYGINCS SRCGCKNDAVCS 465 

I I I I I I I I I I : I : 

Db 598 HHCETNINECSSQPCRLWGTCQDPDNAYLCFCLKGTTGPNCEINLDDCASSPCDSGTCLD 657 

Qy 466 PVDG-SCTCKAGWHGVDCSIR CPSGTWGFGCNL TC 499 

: I I I h h I I: I I I I I II 

Db 658 KIDGYECACEPGYTGSMCNSNIDECAGNPCHNGGTCEDGINGFTCRCPEGYHDPTCLSEV 717 

Qy 500 QCLNGGACNTLDG-TCTCAPGWRGEKCEL PCQDGTYGLNC 538 

h : I : : I : I I I I I I I I : : I : I I h I 

Db 718 NECNSNPCVHGACWDSLNGYKCDCDPGWSGTNCDINNNECESNPCVNGGTCKDMTSGIVC 777 

Qy 539 A ERC DCSHADGC-HPTTGH-CRCLPGWSGVHCDSV CAEG- 575 

I I : I | : | | | : : | | : I I I 

Db 778 TCWEGFSGPNCQTNINECASNPCLNKGTCIDDVAGYKCNCLLPYTGATCEWLAPCAPSP 837 

Qy 576 -RWGPNC SLPCYC KNGASCSPDDGICECAPGFRGTTCQRI CS 616 

| | I III I : I I | : | : I : I I I 

Db 838 CRNGGECRQSEDYESFSCVCPTAGAKGQTCEVDINECVLSPCWHGASCQNTHGXYRCHCQ 897 

Qy 617 PGFYGHRCSQTCPQC VHSSGPCHH — ITGLCDCLPGFTGALCNE 658 

|: | I | hi! I I I I I I I I I I I 

Db 898 AGYSGRNCETDIDDCWPNPCHNGGSCTDGINTAFCDCLPGFWGTFCEEDINECASDPCRN 957 



Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPGWI 695 



I I : I |:| II I I I I I : I : I I I I : 

Db 958 GANCTDCVDSYTCTCPAGFSGIHCENNTPDCTESSCFNGGTC--VDGINSFTCLCPPGFT 1015 

Qy 696 GSDC SQP CPPAHWGPNC IHTCN CHNGAFC 724 

Ml | : | I I : I I I I : I I : I I I I 

Db 1016 GSYCQHWNECDSRPCLLGGTCQDGRGLHRCTCPQGYTGPNCQNLVHWCDSSPCKNGGKC 1075 

Qy 725 SAYDGECKCTPGWTGLYCTQ R 745 

: I I : I I I I I I I I I 
Db 1076 WQTHTQY--RCECPSGWTGLYCDVPSVSCEVAAQRQGVDVARLCQHGGLCVDAGNTHHCR 1133 

Qy 746 CPLGFYGKDCALI CQ CQNGADC-DHI SG-QCTCRTGFMGRHCEQK 788 

I I : I I : I I I I II I i : : i II I : I : I : : 

Db 1134 CQAGYTGSYCEDLVDECSPSPCQNGATCTDYLGGYSCKCVAGYHGVNCSEEIDECLSHPC 1193 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-T 816 

II I I I I I : I = 

Db 1194 QNGGTCLDLPNTYKCSCPWGTQGVHCEINVDDCNPPVDPVSWSPKCFNNGTCVDQVGGYS 1253 

Qy 817 CYCSPGWKGARCDQAGVIIVGNLN 840 

1111:111: | : : | 

Db 1254 CTCPPGFVGERCE GDVN 1270 



RESULT 4 
US-08-899-232-2 

; Sequence 2, Application US/08899232 
; Patent No. 6436650 
; GENERAL INFORMATION: 
; APPLICANT: Artavanis-Tsakonas, Spyridon 
; APPLICANT: Qi, Huilin 
; TITLE OF INVENTION: ACTIVATED FORMS OF NOTCH AND METHODS BASED THEREON 
; FILE REFERENCE: 7326-046 
; CURRENT APPLICATION NUMBER: US/08/899,232 
; CURRENT FILING DATE: 1997-07-23 
; NUMBER OF SEQ ID NOS: 4 
; SOFTWARE: PatentIn Ver. 2.0 
; SEQ ID NO 2 
; LENGTH: 2556 
; TYPE: PRT 
; ORGANISM: Homo sapiens 
US-08-899-232-2 

Query Match 15.4%; Score 1035.5; DB 4; Length 2556; 

Best Local Similarity 25.9%; Pred. No. 1.9e-60; 

Matches 317; Conservative 84; Mismatches 304; Indels 519; Gaps 74; 

Qy 94 CCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCSSACDGDH 143 

I I I I I : I : : I : I : 1 = 1 I 

Db 89 CALGF--SGPLCLTPLDNACLTNPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQA 141 

Qy 144 WGPHCTSRCQCKNGALCNPITGA--CHCAAGFRG WRCEDRCEQG TYGNDCHQ- 193 

II I I I I I : I I I i I I : : I I : I II 

Db 142 --DPCASN-PCANGGQCLPFEASYICHCPPSFHGPTCWQDVNECGQKPRLCRHGGTCHNE 198 

Qy 194 RC QCQNGATC DHVTGECRCPPGYTGAFCE 222 

I I III Illlhll II 

Db 199 VGSYRCVCRATHTGPNCEWPYVPCSPSPCQNGGTCRPTGDVTHECACLPGFTGQNCEENI 258 



Qy 223 DLCPPG— KHGPQC EQRCP CQNGGVCHHVTG- 251 

ill I : I I M I I : I 

Db 259 DDCPGNNCKNGGACVDGVNTYNCPCPPEWTGQYCTEDVDECQLMPNACQNGGTCHNTHGG 318 

Qy 252 -ECSCPSGWMGTVCGQ PCPEGRFGKNC SQEC-- 281 

I I : I I I I : I I II I I : I 

Db 319 YNCVCVNGWTGEDCSENIDDCASAACFHGATCHDRVASFYCECPHGRTGLLCHLNDACIS 378 

Qy 282 -QCHNGGTCDA — ATGQ — CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY 333 

I : I I I I : I I I I I I I I M : I I : I I I 

Db 379 NPCNEGSNCDTNPVNGKAICTCPSGYTGPACSQDVDECSLGAN PCEHAGKCI 430 

Qy 334 HVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRC PCHLENTHSCHPMSGE — CA 386 

: I : I I I : I I II : I I I : i : I III 

Db 431 NTLGSFECQCLQGYTGPRCEIDV NECVSNPC — QNDATCLDQIGEFQCM 477 

Qy 387 CKPGWSGLYC NE TCSPGFYGEACQ QICS C 415 

| | | : | : : | II I I I I I I I : I 

Db 478 CMPGYEGVHCEVNTDECASSPCLHNGRCLDKINEFQCECPTGFTGHLCQYDVDECASTPC 537 

Qy 416 QNGADC DSV-TGKCTCAPGFKG 436

: | I I I I I I I I I I : I 

Db 538 KNGAKCLDGPNTYTCVCTEGYTGTHCEVDIDECDPDPCHYGSCKDGVATFTCLCRPGYTG 597 

Qy 437 IDCST PCPL GTYGINCS SRCGCKNDAVCS 465

II III I I I I I I : 

Db 598 HHCETNINECSSQPCRLWGTCQDPDNAYLCFCLKGTTGPNCEINLDDCASSPCDSGTCLD 657 

Qy 466 PVDG-SCTCKAGWHGVDCSIR CPSGTWGFGCNL TC 499 

: I I I I : I : I I : II 

Db 658 KIDGYECACEPGYTGSMCNSNIDECAGNPCHNGGTCEDGINGFTCRCPEGYHDPTCLSEV 717 

Qy 500 QCLNGGACNTLDG-TCTCAPGWRGEKCEL PCQDGTYGLNC 538 

I : : i : : I : I I I I I I I I : : | : | | I : I 

Db 718 NECNSNPCVHGACWDSLNGYKCDCDPGWSGTNCDINNNECESNPCWGGTCKDMTSGIVC 777 

Qy 539 A ERC DCSHADGC-HPTTGH-CRCLPGWSGVHCDSV CAEG- 575

I I : I I : I I I : : I I : I I I 

Db 778 TCWEGFSGPNCQTNINECASNPCLNKGTCIDDVAGYKCNCLLPYTGATCEWLAPCAPSP 837 

Qy 576 -RWGPNC SLPCYC KNGASCSPDDGICECAPGFRGTTCQRI CS 616 

I I I III | : | | | : | : | : II I 

Db 838 CRNGGECRQSEDYESFSCVCPTAGAKGQTCEVDINECVLSPCWHGASCQNTHGXYRCHCQ 897 

Qy 617 PGFYGHRCSQTCPQC VHSSGPCHH--ITGLCDCLPGFTGALCNE 658 

I : | I I I : I I I i I I I I I I I II 

Db 898 AGYSGRNCETDIDDCWPNPCHNGGSCTDGINTAFCDCLPGFWGTFCEEDINECASDPCRN 957 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPGWI 695 

I I : I I : I I I I I I I I : I : I I I I : 

Db 958 GANCTDCVDSYTCTCPAGFSGIHCENNTPDCTESSCFNGGTC — VDGINSFTCLCPPGFT 1015 

Qy 696 GSDC SQP CPPAHWGPNC IHTCN CHNGAFC 724 

1 I I I : I I I : I I I I : I I : I I I I 

Db 1016 GSYCQHWNECDSRPCLLGGTCQDGRGLHRCTCPQGYTGPNCQNLVHWCDSSPCKNGGKC 1075 



Qy 725 -SAYDGECKCTPGWTGLYCTQ R 745

: I I : i I I I I I II I

Db 1076

Qy 746 CPLGFYGKDCALI CQ CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788

I I : I t : I I I I I I I I I II I : I : I : :

Db 1134

Qy 789 -CPSGTYGYGCRQICD CLNNSTC-DHITG-T 816

I I 1 I I I I I I I I I I : I :

Db 1194

Qy 817

1111:111: I : : I

Db 1254 CTCPPGFVGERCE GDVN 1270



RESULT 5 

US-08-083-590A-20 

; Sequence 20, Application US/08083590A
; Patent No. 5786158
; GENERAL INFORMATION:
; APPLICANT: Artavanis-Tsakonas, S. et al.
; TITLE OF INVENTION: Therapeutic And Diagnostic Methods
; TITLE OF INVENTION: And Compositions Based On Notch Proteins And
; TITLE OF INVENTION: Nucleic Acids
; NUMBER OF SEQUENCES: 21
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Pennie & Edmonds
; STREET: 1155 Avenue of the Americas
; CITY: New York
; STATE: New York
; COUNTRY: U.S.A.
; ZIP: 10036
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/083,590A
; FILING DATE: 25-JUN-1993
; CLASSIFICATION: 435
; ATTORNEY/AGENT INFORMATION:
; NAME: Misrock, S. Leslie
; REGISTRATION NUMBER: 18,872
; REFERENCE/DOCKET NUMBER: 7326-015
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212 790-9090
; TELEFAX: 212 8698864/9741
; TELEX: 66141 PENNIE
; INFORMATION FOR SEQ ID NO: 20:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2556 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: unknown
; MOLECULE TYPE: peptide
US-08-083-590A-20 



Query Match 15.3%; Score 1034.5; DB 1; Length 2556; 

Best Local Similarity 25.8%; Pred. No. 2.2e-60; 

Matches 316; Conservative 83; Mismatches 304; Indels 523; Gaps 73; 

Qy 94 CCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCSSACDGDH 143 

: i : : I : I = h 'II 

Db 89 CALGF--SGPLCLTPLDNACLTNPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQA 141

Qy 144 WGPHCTSRCQCKNGALCNPITGA--CHCAAGFRGWRCE DRCEQG TYGNDCHQ- 193

II I I I I I : I I I Ml : I I -III 

Db 142 --DPCASN-PCANGGQCLPFEASYICHCPPSFHGPTCRQDVNECGQKPRLCRHGGTCHNE 198

Qy 194 RC QCQNGATC DHVTGECRCPPGYTGAFCE 222 

M I I 1 I I I II II I Ihll II 

Db 199 VGSYRCVCRATHTGPNCERPYVPCSPSPCQNGGTCRPTGDVTHECACLPGFTGQNCEENI 258 

Qy 223 DLCPPG— KHGPQC EQRCP CQNGGVCHHVTG- 251 

III I : I I II M • I 

Db 259 DDCPGNNCKNGGACVDGVNTYNCPCPPEWTGQYCTEDVDECQLMPNACQNGGTCHNTHGG 318

Qy 252 -ECSCPSGWMGTVCGQ PCPEGRFGKNC— SQEC— 281 

I I : I I I I : I I II I I : I 

Db 319 YNCVCVNGWTGEDCSENIDDCASAACFHGATCHDRVASFYCECPHGRTGLLCHLNDACIS 378

Qy 282 -QCHNGGTCDA--ATGQ — CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY 333 

|:||| 1:1111111 i I I : I I : I II 

Db 379 NPCNEGSNCDTNPVNGKAICTCPSGYTGPACSQDVDECSLGAN PCEHAGKCI 430

Qy 334 HVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRC PCHLENTHSCHPMSGE— CA 386 

: I : I I I : I I I I : I I I : I : I III 

Db 431 NTLGSFECQCLQGYTGPRCEIDV NECVSNPC--QNDATCLDQIGEFQCM 477 

Qy 387 CKPGWSGLYC NE TCSPGFYGEACQ QICS C 415 

I I 1 : I : : I II I I I I I I I : I 

Db 478 CMPGYEGVHCEVNTDECASSPCLHNGRCLDKINEFQCECPTGFTGHLCQYDVDECASTPC 537

Qy 416 QNGADC DSV-TGKCTCAPGFKG 436

: | | | | I I I I I I I : I 

Db 538 KNGAKCLDGPNTYTCVCTEGYTGTHCEVDIDECDPDPCHYGSCKDGVATFTCLCRPGYTG 597 

Qy 437 IDCST PCPL GTYGINCS SRCGCKNDAVCS 465

II III I I I I I : I : 

Db 598 HHCETNINECSSQPCRLRGTCQDPDNAYLCFCLKGTTGPNCEINLDDCASSPCDSGTCLD 657 

Qy 466 PVDG-SCTCKAGWHGVDCSIR CPSGTWGFGCNL TC 499 

: I I | | : | : I | : I I I I I II 

Db 658 KIDGYECACEPGYTGSMCNSNIDECAGNPCHNGGTCEDGINGFTCRCPEGYHDPTCLSEV 717 

Qy 500 QCLNGGACNTLDG-TCTCAPGWRGEKCEL 527 

I : : I : : I : I I I I I I I I : : 
Db 718 NECNSNPCVHGACRDSLNGYKCDCDPGWSGTNCDINNNECESNPCVNGGTCKDMTSGIVC 777 

Qy 528 PCQDGTYGLNCAERCD CSHADGC-HPTTGH-CRCLPGWSGVHCDSV CAEG- 575 



|::| Ml : I : I hill ::| I: I M 

Db 778 TCREGFSGPNCQTNINECASNPCLNKGTCIDDVAGYKCNCLLPYTGATCEWLAPCAPSP 837 

Qy 576 -RWGPNC SLPCYC KNGASCSPDDGICECAPGFRGTTCQRI CS 616

Ml III | : | | | : | | : I | I 

Db 838 CRNGGECRQSEDYESFSCVCPTAGAKGQTCEVDINECVLSPCRHGASCQNTHGGYRCHCQ 897 

Qy 617 PGFYGHRCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I : | | I III: 

Db 898 AGYSGRNCETDIDDC— RPNPCHNGGSCTDGINTAFCDCLPGFRGTFCEEDINECASDPC 955 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPG 693

I I : I I : I II I I I I I : I =1111 

Db 956 RNGANCTDCVDSYTCTCPAGFSGIHCENNTPDCTESSCFNGGTC — VDGINSFTCLCPPG 1013 

Qy 694 WIGSDC SQP CPPAHWGPNC IHTCN CHNGA 722 

: | | | | : | I I : I t I I : I I : III 

Db 1014 FTGSYCQHWNECDSRPCLLGGTCQDGRGLHRCTCPQGYTGPNCQNLVHWCDSSPCKNGG 1073 

Qy 723 FC SAYDGECKCTPGWTGLYCTQ 744 

I : I I : I I I I I I I I 

Db 1074 KCWQTHTQY--RCECPSGWTGLYCDVPSVSCEVAAQRQGVDVARLCQHGGLCVDAGNTHH 1131



Qy 745 -RCPLGFYGKDCALI CQ CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788

|| | : | I : I I I I I I I I := I I I I : I : I : : 

Db 1132 CRCQAGYTGSYCEDLVDECSPSPCQNGATCTDYLGGYSCKCVAGYHGVNCSEEIDECLSH 1191 



Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG 815

I I I I I I I I I I I I I : I 

Db 1192 PCQNGGTCLDLPNTYKCSCPRGTQGVHCEINVDDCNPPVDPVSRSPKCFNNGTCVDQVGG 1251 

Qy 816 -TCYCSPGWKGARCDQAGVIIVGNLN 840

: I I II : I I I : I : : I 

Db 1252 YSCTCPPGFVGERCE GDVN 1270 



RESULT 6 

US-08-532-384-20 

; Sequence 20, Application US/08532384
; Patent No. 6083904
; GENERAL INFORMATION:
; APPLICANT: Artavanis-Tsakonas, S. et al.
; TITLE OF INVENTION: Therapeutic And Diagnostic Methods
; TITLE OF INVENTION: And Compositions Based On Notch Proteins And
; TITLE OF INVENTION: Nucleic Acids
; NUMBER OF SEQUENCES: 21
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Pennie & Edmonds
; STREET: 1155 Avenue of the Americas
; CITY: New York
; STATE: New York
; COUNTRY: U.S.A.
; ZIP: 10036
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/532,384
; FILING DATE:
; CLASSIFICATION: 424
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/083,590
; FILING DATE: 25-JUN-1993
; ATTORNEY/AGENT INFORMATION:
; NAME: Misrock, S. Leslie
; REGISTRATION NUMBER: 18,872
; REFERENCE/DOCKET NUMBER: 7326-015
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212 790-9090
; TELEFAX: 212 8698864/9741
; TELEX: 66141 PENNIE
; INFORMATION FOR SEQ ID NO: 20:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2556 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: unknown
; MOLECULE TYPE: peptide
US-08-532-384-20 

Query Match 15.3%; Score 1034.5; DB 3; Length 2556; 

Best Local Similarity 25.8%; Pred. No. 2.2e-60; 

Matches 316; Conservative 83; Mismatches 304; Indels 523; Gaps 73; 

Qy 94 CCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCSSACDGDH 143

I II ||:|: : I : I : I = : I I

Db 89 CALGF--SGPLCLTPLDNACLTNPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQA 141

Qy 144 WGPHCTSRCQCKNGALCNPITGA--CHCAAGFRGWRCE DRCEQG TYGNDCHQ- 193

I I I I I I I : I I I III : I I : I I I

Db 142 --DPCASN-PCANGGQCLPFEASYICHCPPSFHGPTCRQDVNECGQKPRLCRHGGTCHNE 198

Qy 194 RC QCQNGATC DHVTGECRCPPGYTGAFCE 222

|| I I I I I I i I II I I I : I I I I

Db 199 VGSYRCVCRATHTGPNCERPYVPCSPSPCQNGGTCRPTGDVTHECACLPGFTGQNCEENI 258

Qy 223 DLCPPG--KHGPQC EQRCP CQNGGVCHHVTG- 251

| || 1:11 M Mill II: I

Db 259 DDCPGNNCKNGGACVDGVNTYNCPCPPEWTGQYCTEDVDECQLMPNACQNGGTCHNTHGG 318

Qy 252 -ECSCPSGWMGTVCGQ PCPEGRFGKNC SQEC 281

I I :ll I I : Mill I : I

Db 319 YNCVCVNGWTGEDCSENIDDCASAACFHGATCHDRVASFYCECPHGRTGLLCHLNDACIS 378

Qy 282 -QCHNGGTCDA--ATGQ--CHCSPGYTGERCQ---DECPVGTYGVLCAETCQCVNGGKCY 333

| : | | | I : I I I I II I Ml M I : I II

Db 379 NPCNEGSNCDTNPVNGKAICTCPSGYTGPACSQDVDECSLGAN PCEHAGKCI 430

Qy 334 HVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRC---PCHLENTHSCHPMSGE--CA 386

: I : | | I : I I I I : I I I M M I I I

Db 431 NTLGSFECQCLQGYTGPRCEIDV NECVSNPC--QNDATCLDQIGEFQCM 477


Qy 387 CKPGWSGLYC NE TCSPGFYGEACQ QICS C 415 

| | | : | : : | II I I I I I I I : I 

Db 478 CMPGYEGVHCEVNTDECASSPCLHNGRCLDKINEFQCECPTGFTGHLCQYDVDECASTPC 537 

Qy 416 QNGADC DSV-TGKCTCAPGFKG 436

: I I I I I I I I I II: I 

Db 538 KNGAKCLDGPNTYTCVCTEGYTGTHCEVDIDECDPDPCHYGSCKDGVATFTCLCRPGYTG 597 

Qy 437 IDCST PCPL GTYGINCS SRCGCKNDAVCS 465 

II III I I I I I I : 

Db 598 HHCETNINECSSQPCRLRGTCQDPDNAYLCFCLKGTTGPNCEINLDDCASSPCDSGTCLD 657 

Qy 466 PVDG-SCTCKAGWHGVDCSIR CPSGTWGFGCNL TC 499 

: I I I I : I : I I : II 

Db 658 KIDGYECACEPGYTGSMCNSNIDECAGNPCHNGGTCEDGINGFTCRCPEGYHDPTCLSEV 717 

Qy 500 QCLNGGACNTLDG-TCTCAPGWRGEKCEL 527 

I :: I :: I : I I I I I I I I 
Db 718 NECNSNPCVHGACRDSLNGYKCDCDPGWSGTNCDINNNECESNPCVNGGTCKDMTSGIVC 777 

Qy 528 PCQDGTYGLNCAERCD CSHADGC-HPTTGH-CRCLPGWSGVHCDSV CAEG- 575

| : : | I I I : I : I I : I I I : : I I : i I I 

Db 778 TCREGFSGPNCQTNINECASNPCLNKGTCIDDVAGYKCNCLLPYTGATCEWLAPCAPSP 837 

Qy 576 -RWGPNC SLPCYC KNGASCSPDDGICECAPGFRGTTCQRI CS 616 

III III I : I I I : I I : I I I 

Db 838 CRNGGECRQSEDYESFSCVCPTAGAKGQTCEVDINECVLSPCRHGASCQNTHGGYRCHCQ 897 

Qy 617 PGFYGHRCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I : I I I Ml: I I I I I I I I I I I 

Db 898 AGYSGRNCETDIDDC--RPNPCHNGGSCTDGINTAFCDCLPGFRGTFCEEDINECASDPC 955 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPG 693

I I : I I : I I I I i I I S : I Mill 

Db 956 RNGANCTDCVDSYTCTCPAGFSGIHCENNTPDCTESSCFNGGTC--VDGINSFTCLCPPG 1013

Qy 694 WIGSDC SQP CPPAHWGPNC IHTCN CHNGA 722 

: | | I I : I I I : I I I I : I I : III 

Db 1014 FTGSYCQHWNECDSRPCLLGGTCQDGRGLHRCTCPQGYTGPNCQNLVHWCDSSPCKNGG 1073 

Qy 723 FC SAYDGECKCTPGWTGLYCTQ 744 

I : I I : I lllll 1 

Db 1074 KCWQTHTQY--RCECPSGWTGLYCDVPSVSCEVAAQRQGVDVARLCQHGGLCVDAGNTHH 1131

Qy 745 -RCPLGFYGKDCALI CQ CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788

II | : | I : I I I I I I I I :: I II I : I : I : : 

Db 1132 CRCQAGYTGSYCEDLVDECSPSPCQNGATCTDYLGGYSCKCVAGYHGVNCSEEIDECLSH 1191 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG 815 

I I I I I I I I I I i I i : I 

Db 1192 PCQNGGTCLDLPNTYKCSCPRGTQGVHCEINVDDCNPPVDPVSRSPKCFNNGTCVDQVGG 1251 

Qy 816 -TCYCSPGWKGARCDQAGVIIVGNLN 840

: I I I I : I I I : I : : I 

Db 1252 YSCTCPPGFVGERCE GDVN 1270



RESULT 7 

US-08-185-432-16 

Sequence 16, Application US/08185432 
Patent No. 5750652 
GENERAL INFORMATION: 

APPLICANT: Artavanis-Tsakonas, Spyridon
APPLICANT: Busseau, Isabelle 
APPLICANT: Diederich, Robert J. 
APPLICANT: Xu, Tian 
APPLICANT: Matsuno, Kenji 

TITLE OF INVENTION: DELTEX PROTEINS, NUCLEIC ACIDS, AND 
TITLE OF INVENTION: ANTIBODIES, AND RELATED METHODS AND COMPOSITIONS 
NUMBER OF SEQUENCES: 23 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: PENNIE & EDMONDS 
STREET: 1155 Avenue of the Americas 
CITY: New York 
STATE: New York 
COUNTRY: U.S.A. 
ZIP: 10036-2711 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/185,432 
FILING DATE: 21-JAN-1994 
CLASSIFICATION: 530
ATTORNEY/AGENT INFORMATION:
NAME: Misrock, S. Leslie 
REGISTRATION NUMBER: 18,872 
REFERENCE/DOCKET NUMBER: 7326-006 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (212) 790-9090 
TELEFAX: (212) 869-8864/9741
TELEX: 66141 PENNIE 
INFORMATION FOR SEQ ID NO: 16: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2471 amino acids 
TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-185-432-16 

Query Match 15.0%; Score 1014.5; DB 1; Length 2471; 

Best Local Similarity 23.0%; Pred. No. 4.5e-59;

Matches 348; Conservative 106; Mismatches 354; Indels 707; Gaps 78; 

Qy 93 QCCPGFYESGEMCVPHCADKCVHGR-CIAPNTCQ CEPGWGGTNCSSACDG 141 

: I I I : II I : I I I : II | : | : | | 
Db 91 RCASGF — TGEDCQYSTSHPCFVSRPCLNGGTCHMLSRDTYECTCQVGFTGKEC — 142 

Qy 142 DHWGPHCTSRCQCKNGALCNPITG— ACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQN 199 

I || | | 1 : | : : I I II I : I I I I : I I I : 

Db 143 -QWTDACLSH-PCANGSTCTTVANQFSCKCLTGFTGQKCE TDVNECDIPGHCQH 194 



Qy 200 GATCDHVTG— ECRCPPGYTGAFCEDL- CPPGKHG 231 

I I I : : I : I : I I I : I I : I : I I I I I 

Db 195 GGTCLNLPGSYQCQCPQGFTGQYCDSLYVPCAPSPCVNGGTCRQTGDFTFECNCLPGFEG 254 

Qy 232 PQCEQ RCP CQNGGV 245 

II: III I I I I I 

Db 255 STCERNIDDCPNHRCQNGGVCVDGVNTYNCRCPPQWTGQFCTEDVDECLLQPNACQNGGT 314 

Qy 246 CHHVTG — ECSCPSGWMGTVCGQ PCPEGRFGKNC- 277 

I : I I I : I I I I : Nihil 

Db 315 CANRNGGYGCVCVNGWSGDDCSENIDDCAFASCTPGSTCIDRVASFSCMCPEGKAGLLCH 374 

Qy 278 -SQEC QCHNGGTCDA — ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQC 326 

I I I I II II II III I IN: I : I 

Db 375 LDDACISNPCHKGALCDTNPLNGQYICTCPQGYKGADCTEDVDECAM ANSNPC 427 

Qy 327 VNGGKCYHVSGA — CLCEAGFAGERCE ARLCPEGLY 360 

: I I I : I I I I I : I I I I I ill 

Db 428 EHAGKCVNTDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLDKIGGFTCLCMPGFK 487

Qy 361 GIKCDKR CP CHLE NTHSC 378 

|:|: II I : : II 

Db 488 GVHCELEINECQSNPCVNNGQCVDKVNRFQCLCPPGFTGPVCQIDIDDCSSTPCLNGAKC 547

Qy 379 —HPMSGECACKPGWSGLYCNET CSPGFYGEAC-QQ 411 

II | | | | : : | : | I 1-11= I I I 

Db 548 IDHPNGYECQCATGFTGVLCEENIDNCDPDPCHHGQCQDGIDSYTCICNPGYMGAICSDQ 607 

Qy 412 I CSCQNGA DCDS VTG K 427 

I hill 1 I I : I 

Db 608 IDECYSSPCLNDGRCIDLVNGYQCNCQPGTSGVNCEINFDDCASNPCIHGICMDGINRYS 667 

Qy 428 CTCAPGFKGIDC STPCPLGTYGINCSS — RCGCK NDAVCS 465 

IINIIII I I I I I I : I I I h : : 

Db 668 CVCSPGFTGQRCNIDIDECASNPCRKGATCINGVNGFRCICPEGPHHPSCYSQVNECLSN 727

Qy 466 P-VDGSCT CKAGWHGVDCSI RCPSGT 490

I s 1 = 11 I I I I h : I : II 

Db 728 PCIHGNCTGGLSGYKCLCDAGWVGINCEVDKNECLSNPCQNGGTCDNLVNGYRCTCKKGF 787 

Qy 491 WGFGCNLTCQ CLNGGAC 507 

I : I : I I I I I 

Db 788 KGYNCQVNIDECASNPCLNQGTCFDDISGYTCHCVLPYTGKNCQTVLAPCSPNPCENAAV 847 

Qy 508 NTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGH — CRCL 560 

I | | | | | | | : | : : | : : | : I : I I I I II 

Db 848 CKESPNFESYTCLCAPGWQGQRCTIDIDE CISK-PCMNHGLCHNTQGSYMCECP 900 

Qy 561 PGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI CECAPGFRGTTCQR 613 

I h I h h : l 1 = 11 II I h I l I I I i I I 

Db 901 PGFSGMDCEEDIDDCLANP CQNGGSCM--DGVNTFSCLCLPGFTGDKCQTDMN 951 

Qy 614 ICSPGFYGHRCSQTCPQCVHSS GPCHHITGL 644 

I II ! I Nil | | : | : 

Db 952 ECLSEPCKNGGTCSDYVNSYTCKCQAGFDGVHCENNINECTESSCFNGGTC--VDGINSF 1009



Qy 645 -CDCLPGFTGALC NEV CPSGRFGKNC AGICT- 674



I llllhl I! M 1 I I I I : I : 

Db 1010 SCLCPVGFTGSFCLHEINECSSHPCLNEGTCVDGLGTYRCSCPLGYTGKNCQTLVNLCSR 1069 

Qy 675 --CTNNGTC--NPIDRSCQCYPGWIGSDCSQP 702 

I I I I I : II 1 I I : I I 

Db 1070 SPCKNKGTCVQKKAESQCLCPSGWAGAYCDVPNVSCDIAASRRGVLVEHLCQHSGVCINA 1129

Qy 703 CPPAHWGPNC IHTC NCHNGAFCSAYDG--ECKCTPGWTGLYCTQR-- 745

i | : | | : I I : II I I : I I : I I I : I : I 

Db 1130 GNTHYCQCPLGYTGSYCEEQLDECASNPCQHGATCSDFIGGYRCECVPGYQGVNCEYEVD 1189 

Qy 746 CPLGFYG KDCALICQCQNGADC-DHISGQ- 773

I I I I I I I I I I I I I I 

Db 1190 ECQNQPCQNGGTCIDLVNHFKCSCPPGTRGLLCEENIDDCARGPHCLNGGQCMDRIGGYS 1249 

Qy 774 CTCRTGFMGRHCEQKCPSGT 793

I II: I Mill I 

Db 1250 CRCLPGFAGERCEGDINECLSNPCSSEGSLDCIQLTNDYLCVCRSAFTGRHCE T 1303 

Qy 794 YGYGCRQICDCLNNSTCDHITG TCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTA 848

: I I : I! I I I : I I I I : i I II : I = 

Db 1304 FVDVCPQM-PCLNGGTCAVASNMPDGFICRCPPGFSGARCQSS CGQVKC 1351 

Qy 849 LPADSYQIGAIAGIIILVLWLFLLALFI IYRHKQKGKE S SMPAVT YT PAMRWNA 904 

: I I : : : I I : I : I : 

Db 1352 RKGEQCVHTASGPR-CFCPSPRDCES 1376 

Qy 905 DYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN 964 

I I : I I : I I I :: I I I I : I I 

Db 1377 GCASS PCQHGGSC— HPQRQPPYYS-CQCA-PPFSGSR CELYT 1415 

Qy 965 VNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNS EYNSSNC 1021 

III II I :| || ::: :| 

Db 1416 APPST — PPATCLSQYCADKARDGVCDE ACNSHACQWDGGDC 1455 

Qy 1022 SLSSSENPYATIKDP 1036 

II : I I I : I I 
Db 1456 SL-TMENPWANCSSP 1469 



RESULT 8 

US-08-083-590A-19 

; Sequence 19, Application US/08083590A
; Patent No. 5786158
; GENERAL INFORMATION:
; APPLICANT: Artavanis-Tsakonas, S. et al.
; TITLE OF INVENTION: Therapeutic And Diagnostic Methods
; TITLE OF INVENTION: And Compositions Based On Notch Proteins And
; TITLE OF INVENTION: Nucleic Acids
; NUMBER OF SEQUENCES: 21
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Pennie & Edmonds
; STREET: 1155 Avenue of the Americas
; CITY: New York
; STATE: New York
; COUNTRY: U.S.A.
; ZIP: 10036
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/083,590A
; FILING DATE: 25-JUN-1993
; CLASSIFICATION: 435
; ATTORNEY/AGENT INFORMATION:
; NAME: Misrock, S. Leslie
; REGISTRATION NUMBER: 18,872
; REFERENCE/DOCKET NUMBER: 7326-015
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212 790-9090
; TELEFAX: 212 8698864/9741
; TELEX: 66141 PENNIE
; INFORMATION FOR SEQ ID NO: 19:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2471 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: unknown
; MOLECULE TYPE: peptide
US-08-083-590A-19 



Query Match 15.0%; Score 1014.5; DB 1; Length 2471; 

Best Local Similarity 23.0%; Pred. No. 4.5e-59; 

Matches 348; Conservative 106; Mismatches 354; Indels 707; Gaps 78;



Qy 93 QCCPGFYESGEMCVPHCADKCVHGR-CIAPNTCQ CEPGWGGTNCSSACDG 141

: I I I : 1 1 1 : 1 1 1 : II | : | : | I

Db 91 RCASGF--TGEDCQYSTSHPCFVSRPCLNGGTCHMLSRDTYECTCQVGFTGKEC 142

Qy 142 DHWGPHCTSRCQCKNGALCNPITG--ACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQN 199

| || I I I : I : : 1 1 II 1 : 1 1 1 1 : 1 :

Db 143 -QWTDACLSH-PCANGSTCTTVANQFSCKCLTGFTGQKCE TDVNECDIPGHCQH 194

Qy 200 GATCDHVTG--ECRCPPGYTGAFCEDL CPPGKHG 231

1 1 1 : : 1 : I : 1 1 1 : 1 1 : 1 : 1 1 1 1 1

Db 195 GGTCLNLPGSYQCQCPQGFTGQYCDSLYVPCAPSPCVNGGTCRQTGDFTFECNCLPGFEG 254

Qy 232 PQCEQ RCP CQNGGV 245

M: II 1 Mill

Db 255 STCERNIDDCPNHRCQNGGVCVDGVNTYNCRCPPQWTGQFCTEDVDECLLQPNACQNGGT 314

Qy 246 CHHVTG--ECSCPSGWMGTVCGQ PCPEGRFGKNC- 277

1:1 11=111 1 ' lllhll

Db 315 CANRNGGYGCVCVNGWSGDDCSENIDDCAFASCTPGSTCIDRVASFSCMCPEGKAGLLCH 374

Qy 278 -SQEC QCHNGGTCDA--ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQC 326

I Mill II I 1 1 1 1 1 Ml: 1 : 1

Db 375 LDDACISNPCHKGALCDTNPLNGQYICTCPQGYKGADCTEDVDECAM ANSNPC 427

Qy 327 VNGGKCYHVSGA--CLCEAGFAGERCE ARLCPEGLY 360

: M 1 : II II 1 = 11 III Ml

Db 428 EHAGKCVNTDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLDKIGGFTCLCMPGFK 487

Qy 361 GIKCDKR CP CHLE NTHSC 378 

|:|: II I :: I I 

Db 488 GVHCELEINECQSNPCVNNGQCVDKVNRFQCLCPPGFTGPVCQIDIDDCSSTPCLNGAKC 547 

Qy 379 --HPMSGECACKPGWSGLYCNET CSPGFYGEAC-QQ 411 

II I I I I : : I : I I 1 : I I : I I I 

Db 548 IDHPNGYECQCATGFTGVLCEENIDNCDPDPCHHGQCQDGIDSYTCICNPGYMGAICSDQ 607

Qy 412 I CSCQNGA DCDS VTG K 427 

I I : I I I I I I : I 

Db 608 IDECYSSPCLNDGRCIDLVNGYQCNCQPGTSGVNCEINFDDCASNPCIHGICMDGINRYS 667 

Qy 428 CTCAPGFKGI DC STPCPLGTYGINCSS--RCGCK NDAVCS 465 

11:11111 I I I I I I : I I I I : : : 

Db 668 CVCSPGFTGQRCNIDIDECASNPCRKGATCINGVNGFRCICPEGPHHPSCYSQVNECLSN 727 

Qy 466 P-VDGSCT CKAGWHGVDCSI RCPSGT 490 

I : I : I I I I I I I : : I : II 

Db 728 PCIHGNCTGGLSGYKCLCDAGWVGINCEVDKNECLSNPCQNGGTCDNLVNGYRCTCKKGF 787

Qy 491 WGFGCNLTCQ CLNGGAC 507

I : I : Mill 
Db 788 KGYNCQVNIDECASNPCLNQGTCFDDISGYTCHCVLPYTGKNCQTVLAPCSPNPCENAAV 847

Qy 508 NTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGH— CRCL 560 

I I I I I I I I : I : : I : : I : I : I I I I I I 

Db 848 CKESPNFESYTCLCAPGWQGQRCTIDIDE CISK-PCMNHGLCHNTQGSYMCECP 900 

Qy 561 PGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI CECAPGFRGTTCQR 613 

| | : I ! : I : : I Mill I I : I I I I I I I I 

Db 901 PGFSGMDCEEDIDDCLANP CQNGGSCM — DGVNTFSCLCLPGFTGDKCQTDMN 951 

Qy 614 ICSPGFYGHRCSQTCPQCVHSS GPCHHITGL 644 

I I I I I : I I I I I : I : 

Db 952 ECLSEPCKNGGTCSDYVNSYTCKCQAGFDGVHCENNINECTESSCFNGGTC— VDGINSF 1009 

Qy 645 -CDCLPGFTGALC NEV CPSGRFGKNC AGICT- 674 

I I MM: I M II I MM :|: 

Db 1010 SCLCPVGFTGSFCLHEINECSSHPCLNEGTCVDGLGTYRCSCPLGYTGKNCQTLVNLCSR 1069 

Qy 675 --CTNNGTC--NPIDRSCQCYPGWIGSDCSQP 702 

I I II I : II I I I : I I 

Db 1070 SPCKNKGTCVQKKAESQCLCPSGWAGAYCDVPNVSCDIAASRRGVLVEHLCQHSGVCINA 1129 

Qy 703 CPPAHWGPNC IHTC NCHNGAFCSAYDG--ECKCTPGWTGLYCTQR-- 745

M : I I : I I M I M : I I : I M : I : I 

Db 1130 GNTHYCQCPLGYTGSYCEEQLDECASNPCQHGATCSDFIGGYRCECVPGYQGVNCEYEVD 1189 

Qy 746 CPLGFYG KDCALICQCQNGADC-DHISGQ- 773

II I I III I I I I I I I 

Db 1190 ECQNQPCQNGGTCIDLVNHFKCSCPPGTRGLLCEENIDDCARGPHCLNGGQCMDRIGGYS 1249 

Qy 774 CTCRTGFMGRHCEQKCPSGT 793 

i I I : I I II II I 

Db 1250 CRCLPGFAGERCEGDINECLSNPCSSEGSLDCIQLTNDYLCVCRSAFTGRHCE-- T 1303 



Qy 794 YGYGCRQICDCLNNSTCDHITG TCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTA 848

I I: III II : 1111:1111: I : 
Db 1304 FVDVCPQM-PCLNGGTCAVASNMPDGFICRCPPGFSGARCQSS CGQVKC 1351

Qy 849 L PAD S YQ I G A I AG III LVLWL F L L AL F 1 1 Y RH KQ K G K E S SMPAVTYTPAMRWNA 904 

: II : : : I I : I : I : 

Db 1352 RKGEQCVHTASGPR-CFCPSPRDCES 1376 

Qy 905 DYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN 964

I I : I I : I | | :: I I I I : I I 
Db 1377 GCASS PCQHGGSC — HPQRQPPYYS-CQCA-PPFSGSR CELYT 1415 

Qy 965 VNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNS EYNSSNC 1021 

I II II I :| M — :| 

Db 1416 APPST— PPATCLSQYCADKARDGVCDE ACNSHACQWDGGDC 1455 

Qy 1022 SLSSSENPYATIKDP 1036 

I I : I I I : I I 
Db 1456 SL-TMENPWANCSSP 1469 



RESULT 9 

US-08-532-384-19 

; Sequence 19, Application US/08532384
; Patent No. 6083904
; GENERAL INFORMATION:
; APPLICANT: Artavanis-Tsakonas, S. et al.
; TITLE OF INVENTION: Therapeutic And Diagnostic Methods
; TITLE OF INVENTION: And Compositions Based On Notch Proteins And
; TITLE OF INVENTION: Nucleic Acids
; NUMBER OF SEQUENCES: 21
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Pennie & Edmonds
; STREET: 1155 Avenue of the Americas
; CITY: New York
; STATE: New York
; COUNTRY: U.S.A.
; ZIP: 10036
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/532,384
; FILING DATE:
; CLASSIFICATION: 424
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/083,590
; FILING DATE: 25-JUN-1993
; ATTORNEY/AGENT INFORMATION:
; NAME: Misrock, S. Leslie
; REGISTRATION NUMBER: 18,872
; REFERENCE/DOCKET NUMBER: 7326-015
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 212 790-9090
; TELEFAX: 212 8698864/9741
; TELEX: 66141 PENNIE
; INFORMATION FOR SEQ ID NO: 19:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2471 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: unknown
; MOLECULE TYPE: peptide

US-08-532-384-19 

Query Match 15.0%; Score 1014.5; DB 3; Length 2471; 

Best Local Similarity 23.0%; Pred. No. 4.5e-59; 

Matches 348; Conservative 106; Mismatches 354; Indels 707; Gaps 78; 



Qy 93 QCCPGFYESGEMCVPHCADKCVHGR-CIAPNTCQ CEPGWGGTNCSSACDG 141

: 1 1 1 : i 1 i : 1 1 1 : II 1 : 1 : 1 1

Db 91 RCASGF--TGEDCQYSTSHPCFVSRPCLNGGTCHMLSRDTYECTCQVGFTGKEC 142

Qy 142 DHWGPHCTSRCQCKNGALCNPITG--ACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQN 199

1 II 1 1 1 : 1 : : 1 1 1 1 1 : 1 1 1 1 : 1 1 1 :

Db 143 -QWTDACLSH-PCANGSTCTTVANQFSCKCLTGFTGQKCE TDVNECDIPGHCQH 194

Qy 200 GATCDHVTG--ECRCPPGYTGAFCEDL CPPGKHG 231

1 1 1 : : 1 : 1 : 1 1 1 : 1 1 : i : 1 1 1 1 1

Db 195 GGTCLNLPGSYQCQCPQGFTGQYCDSLYVPCAPSPCVNGGTCRQTGDFTFECNCLPGFEG 254

Qy 232 PQCEQ RCP CQNGGV 245

II: 1 II 1 1 1 1 1

Db 255 STCERNIDDCPNHRCQNGGVCVDGVNTYNCRCPPQWTGQFCTEDVDECLLQPNACQNGGT 314

Qy 246 CHHVTG--ECSCPSGWMGTVCGQ PCPEGRFGKNC- 277

1 : 1 1 1 : 1 1 1 1 : Nihil

Db 315 CANRNGGYGCVCVNGWSGDDCSENIDDCAFASCTPGSTCIDRVASFSCMCPEGKAGLLCH 374

Qy 278 -SQEC QCHNGGTCDA--ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQC 326

1 1 1 1 1 1 1 1 1 1 1 1 1 1 III: 1:1

Db 375 LDDACISNPCHKGALCDTNPLNGQYICTCPQGYKGADCTEDVDECAM ANSNPC 427

Qy 327 VNGGKCYHVSGA--CLCEAGFAGERCE ARLCPEGLY 360

: III : II II 1:11 Ml 1

Db 428 EHAGKCVNTDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLDKIGGFTCLCMPGFK 487

Qy 361 GIKCDKR CP CHLE NTHSC 378

I : I : II 1 : : II

Db 488 GVHCELEINECQSNPCVNNGQCVDKVNRFQCLCPPGFTGPVCQIDIDDCSSTPCLNGAKC 547

Qy 379 --HPMSGECACKPGWSGLYCNET CSPGFYGEAC-QQ 411

II | | | | : : | : | | 1 : 1 1 : 1 1 1

Db 548 IDHPNGYECQCATGFTGVLCEENIDNCDPDPCHHGQCQDGIDSYTCICNPGYMGAICSDQ 607

Qy 412 I CSCQNGA DCDS VTG K 427

I 1:111 1 1 1 : 1

Db 608 IDECYSSPCLNDGRCIDLVNGYQCNCQPGTSGVNCEINFDDCASNPCIHGICMDGINRYS 667

Qy 428 CTCAPGFKGIDC STPCPLGTYGINCSS--RCGCK NDAVCS 465

1 I : 1 1 1 1 1 1 1 1 1 11:111 1 : : =

Db 668 CVCSPGFTGQRCNIDIDECASNPCRKGATCINGVNGFRCICPEGPHHPSCYSQVNECLSN 727


Qy 466 P-VDGSCT CKAGWHGVDCSI RCPSGT 490 

I : I : I I I I I I I : : I : II 

Db 728 PCIHGNCTGGLSGYKCLCDAGWVGINCEVDKNECLSNPCQNGGTCDNLVNGYRCTCKKGF 787 

Qy 491 WGFGCNLTCQ CLNGGAC 507 

I : I : 1 I I I I 

Db 788 KGYNCQVNIDECASNPCLNQGTCFDDISGYTCHCVLPYTGKNCQTVLAPCSPNPCENAAV 847

Qy 508 NTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGH--CRCL 560 

I I I I I I I I : I : : I : = | : | : I I I I II 

Db 848 CKESPNFESYTCLCAPGWQGQRCTIDIDE CISK-PCMNHGLCHNTQGSYMCECP 900 

Qy 561 PGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI CECAPGFRGTTCQR 613 

| | : M : | : : I I : I I I I I I : I I Ml I II 

Db 901 PGFSGMDCEEDIDDCLANP CQNGGSCM--DGVNTFSCLCLPGFTGDKCQTDMN 951

Qy 614 ICSPGFYGHRCSQTCPQCVHSS GPCHHITGL 644 

I I I I I : I I I | I : I : 

Db 952 ECLSEPCKNGGTCSDYVNSYTCKCQAGFDGVHCENNINECTESSCFNGGTC— VDGINSF 1009 

Qy 645 -CDCLPGFTGALC NEV CPSGRFGKNC— AGICT- 674 

i I I I I I : i II : I : 

Db 1010 SCLCPVGFTGSFCLHEINECSSHPCLNEGTCVDGLGTYRCSCPLGYTGKNCQTLVNLCSR 1069 

Qy 675 — CTNNGTC — NPIDRSCQCYPGWIGSDCSQP 702 

I I I I I : II I I I : i I 

Db 1070 SPCKNKGTCVQKKAESQCLCPSGWAGAYCDVPNVSCDIAASRRGVLVEHLCQHSGVCINA 1129 

Qy 703 CPPAHWGPNC IHTC NCHNGAFCSAYDG--ECKCTPGWTGLYCTQR-- 745 

| | : I ! : I I : I I I I : I I : I I I : I : I 

Db 1130 GNTHYCQCPLGYTGSYCEEQLDECASNPCQHGATCSDFIGGYRCECVPGYQGVNCEYEVD 1189 

Qy 746 CPLGFYG KDCALICQCQNGADC-DHI SGQ- 773 

I I I I III I I I I I I I 

Db 1190 ECQNQPCQNGGTCIDLVNHFKCSCPPGTRGLLCEENIDDCARGPHCLNGGQCMDRIGGYS 1249

Qy 774 CTCRTGFMGRHCEQKCPSGT 793

111:111111 I 

Db 1250 CRCLPGFAGERCEGDINECLSNPCSSEGSLDCIQLTNDYLCVCRSAFTGRHCE T 1303 

Qy 794 YGYGCRQICDCLNNSTCDHITG TCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTA 848

: I I : I I I I I : I I I I : I I I I : I • 

Db 1304 FVDVCPQM-PCLNGGTCAVASNMPDGFICRCPPGFSGARCQSS CGQVKC 1351 

Qy 849 LPADS YQI GAIAGI 1 1 LVLWLFLLALFI I YRHKQKGKE S SMPAVT YT PAMRWNA 904 

: I I : : : I I : I : I : 

Db 1352 RKGEQCVHTASGPR-CFCPSPRDCES 1376

Qy 905 DYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN 964 

I I : I I : I | | : : I | | I : I I 

Db 1377 GCASS PCQHGGSC — HPQRQPPYYS-CQCA-PPFSGSR CELYT 1415 

Qy 965 VNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRS YMGKSLKDLGKNS EYNSSNC 1021 

I : I I I : : : : I 

Db 1416 APPST--PPATCLSQYCADKARDGVCDE ACNSHACQWDGGDC 1455 



Qy 1022 SLSSSENPYATIKDP 1036 

I I : I I I : I I 
Db 1456 SL-TMENPWANCSSP 1469



RESULT 10 
US-08-899-232-1 

Sequence 1, Application US/08899232 
Patent No. 6436650 
GENERAL INFORMATION: 
APPLICANT: Artavanis-Tsakonas, Spyridon
APPLICANT: Qi, Huilin 

TITLE OF INVENTION: ACTIVATED FORMS OF NOTCH AND METHODS BASED THEREON 
FILE REFERENCE: 7326-046 

CURRENT APPLICATION NUMBER: US/08/899,232
CURRENT FILING DATE: 1997-07-23
NUMBER OF SEQ ID NOS: 4
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 1 
LENGTH: 2471 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-08-899-232-1 

Query Match 15.0%; Score 1014.5; DB 4; Length 2471; 

Best Local Similarity 23.0%; Pred. No. 4.5e-59; 

Matches 348; Conservative 106; Mismatches 354; Indels 707; Gaps 78; 

Qy 93 QCCPGFYESGEMCVPHCADKCVHGR-CIAPNTCQ CEPGWGGTNCSSACDG 141 

: I I I : I I I : I I h II I : I : I I 
Db 91 RCASGF--TGEDCQYSTSHPCFVSRPCLNGGTCHMLSRDTYECTCQVGFTGKEC 142

Qy 142 DHWGPHCTSRCQCKNGALCNPITG— ACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQN 199 

I || | | | : | : : I I I I I : I I I I : I 11 = 

Db 143 -QWTDACLSH-PCANGSTCTTVANQFSCKCLTGFTGQKCE TDVNECDI PGHCQH 194 

Qy 200 GATCDHVTG — ECRCPPGYTGAFCEDL CPPGKHG 231 

I I I : : I : I : I I I : I I : I : I I I I I 

Db 195 GGTCLNLPGSYQCQCPQGFTGQYCDSLYVPCAPSPCVNGGTCRQTGDFTFECNCLPGFEG 254 

Qy 232 PQCEQ RCP CQNGGV 245 

||: III Mill 

Db 255 STCERNIDDCPNHRCQNGGVCVDGVNTYNCRCPPQWTGQFCTEDVDECLLQPNACQNGGT 314

Qy 246 CHHVTG--ECSCPSGWMGTVCGQ PCPEGRFGKNC- 277 

I : I I I : I I I I : IN I : I I 

Db 315 CANRNGGYGCVCVNGWSGDDCSENIDDCAFASCTPGSTCIDRVASFSCMCPEGKAGLLCH 374 

Qy 278 -SQEC- — QCHNGGTCDA— ATGQ— CHCSPGYTGERCQ— DECPVGTYGVLCAETCQC 326 

I II I I I I I I I I I I I III: I : I 

Db 375 LDDACISNPCHKGALCDTNPLNGQYICTCPQGYKGADCTEDVDECAM ANSNPC 427 

Qy 327 VNGGKCYHVSGA — CLCEAGFAGERCE ARLCPEGLY 360 

: I I I : 1 I I I I : I I I II III 

Db 428 EHAGKCVNTDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLDKIGGFTCLCMPGFK 487 



Qy 361 GIKCDKR CP CHLE NTHSC 378



|: |: M I I I 

Db 488 GVHCELEINECQSNPCVNNGQCVDKVNRFQCLCPPGFTGPVCQIDIDDCSSTPCLNGAKC 547 



Qy 379 --HPMSGECACKPGWSGLYCNET CSPGFYGEAC-QQ 411

II | i | | : : | : I I I : I I : I I I 

Db 548 IDHPNGYECQCATGFTGVLCEENIDNCDPDPCHHGQCQDGIDSYTCICNPGYMGAICSDQ 607 



Qy 412 I CSCQNGA DCDS VTG K 427

| I : I I I I I I : I 

Db 608 IDECYSSPCLNDGRCIDLVNGYQCNCQPGTSGVNCEINFDDCASNPCIHGICMDGINRYS 667



Qy 428 CTCAPGFKGIDC STPCPLGTYGINCSS — RCGCK NDAVCS 465 

||:|]||| I I I I ! I : I I I I : : = 

Db 668 CVCSPGFTGQRCNIDIDECASNPCRKGATCINGVNGFRCICPEGPHHPSCYSQVNECLSN 727 

Qy 466 P-VDGSCT CKAGWHGVDCSI RCPSGT 490 

| : I : I I I I I I I : : I : II 

Db 728 PCIHGNCTGGLSGYKCLCDAGWVGINCEVDKNECLSNPCQNGGTCDNLVNGYRCTCKKGF 787



Qy 491 WGFGCNLTCQ CLNGGAC 507

I: I : 

Db 788 KGYNCQVNIDECASNPCLNQGTCFDDISGYTCHCVLPYTGKNCQTVLAPCSPNPCENAAV 847



Qy 508 NTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGH CRCL 560

I | | | | | | | : | : : | : : | : I : I I I I I I 

Db 848 CKESPNFESYTCLCAPGWQGQRCTIDIDE CISK-PCMNHGLCHNTQGSYMCECP 900

Qy 561 PGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI CECAPGFRGTTCQR 613 

| | : | | : | : : I I : I I I I I I : MINI II 

Db 901 PGFSGMDCEEDIDDCLANP CQNGGSCM--DGVNTFSCLCLPGFTGDKCQTDMN 951 

Qy 614 ICSPGFYGHRCSQTCPQCVHSS GPCHHITGL 644

I III I : I I I II:!: 

Db 952 ECLSEPCKNGGTCSDYVNSYTCKCQAGFDGVHCENNINECTESSCFNGGTC VDGINSF 1009 



Qy 645 -CDCLPGFTGALC NEV CPSGRFGKNC AGICT- 674

I I Mil: I II I -I- 

Db 1010 SCLCPVGFTGSFCLHEINECSSHPCLNEGTCVDGLGTYRCSCPLGYTGKNCQTLVNLCSR 1069 



Qy 675 --CTNNGTC--NPIDRSCQCYPGWIGSDCSQP 702

I I I I I : M I I I : I I 

Db 1070 SPCKNKGTCVQKKAESQCLCPSGWAGAYCDVPNVSCDIAASRRGVLVEHLCQHSGVCINA 1129 



Qy 703 CPPAHWGPNC— IHTC— NCHNGAFCSAYDG— ECKCTPGWTGLYCTQR— 745

|| : | I : I I : I I I I : I I : I I I : I : I 

Db 1130 GNTHYCQCPLGYTGSYCEEQLDECASNPCQHGATCSDFIGGYRCECVPGYQGVNCEYEVD 1189 



Qy 746 CPLGFYG KDCALICQCQNGADC-DHI SGQ- 773

III I III I I I I I I I 

Db 1190 ECQNQPCQNGGTCIDLVNHFKCSCPPGTRGLLCEENIDDCARGPHCLNGGQCMDRIGGYS 1249 



Qy 774 CTCRTGFMGRHCEQKCPSGT 793

111:111111 I 

Db 1250 CRCLPGFAGERCEGDINECLSNPCSSEGSLDCIQLTNDYLCVCRSAFTGRHCE T 1303 



Qy 794 YGYGCRQICDCLNNSTCDHITG TCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTA 848

: I I : I I I I I : I I I I : I I I I : I : 



Db 1304 FVDVCPQM-PCLNGGTCAVASNMPDGFICRCPPGFSGARCQSS CGQVKC 1351 

Qy 849 LPADS YQI GAIAGI 1 1 LVLWLFLLALFI I YRHKQKGKE SSMPAVTYTPAMRWNA 904 

: I I : : : I I : I : I : 

Db 1352 RKGEQCVHTASGPR-CFCPSPRDCES 1376

Qy 905 DYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN 964 

I I : I I : I I I:: III I • I I 

Db 1377 GCASS PCQHGGSC~-HPQRQPPYYS-CQCA~PPFSGSR CELYT 1415 

Qy 965 VNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNS EYNSSNC 1021 

I li II I :| || ::: :| 

Db 1416 APPST--PPATCLSQYCADKARDGVCDE ACNSHACQWDGGDC 1455 

Qy 1022 SLSSSENPYATIKDP 1036 

I I : I I I : I I 
Db 1456 SL-TMENPWANCSSP 1469 



RESULT 11 
US-08-185-432-19 

Sequence 19, Application US/08185432 
Patent No. 5750652 
GENERAL INFORMATION: 

APPLICANT: Artavanis-Tsakonas, Spyridon
APPLICANT: Busseau, Isabelle 
APPLICANT: Diederich, Robert J. 
APPLICANT: Xu, Tian 
APPLICANT: Matsuno, Kenji 

TITLE OF INVENTION: DELTEX PROTEINS, NUCLEIC ACIDS, AND 
TITLE OF INVENTION: ANTIBODIES, AND RELATED METHODS AND COMPOSITIONS 
NUMBER OF SEQUENCES: 23 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: PENNIE & EDMONDS 
STREET: 1155 Avenue of the Americas 
CITY: New York 
STATE: New York 
COUNTRY: U.S.A. 
ZIP: 10036-2711 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/185,432
FILING DATE: 21-JAN-1994 
CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 
NAME: Misrock, S. Leslie 
REGISTRATION NUMBER: 18,872 
REFERENCE/DOCKET NUMBER: 7326-006
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (212) 790-9090 
TELEFAX: (212) 869-8864/9741 
TELEX: 66141 PENNIE 
INFORMATION FOR SEQ ID NO: 19: 



SEQUENCE CHARACTERISTICS: 
LENGTH: 2703 amino acids 
TYPE: amino acid 
TOPOLOGY: unknown 
MOLECULE TYPE: protein 
US-08-185-432-19 

Query Match 14.5%; Score 978.5; DB 1; Length 2703; 

Best Local Similarity 26.8%; Pred. No. 1.2e-56; 

Matches 290; Conservative 103; Mismatches 296; Indels 395; Gaps 70; 

Qy 7 SCL SFICLLLCHWIGTASPLNLED--PNVCSHWESYSVTVQESYPHPFDQIYYTSC 60 

III : I I : : : I I : : : : : II : : I 

Db 502 SCLDDPGTFRCVCMPGFTGTQCEIDIDECQSNPC LNDGTC 541 

Qy 61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMC VPHCADKCVHGR 117 

I : I I I I : | | | : | | : I : | 

Db 542 HDKINGFKCS CALGF--TGARCQINIDDCQSQPCRNR 576

Qy 118 CIAPNTCQCEPGWGGTNCS SACDGDHWGPHCTSRCQCKNGALCNPITG-ACH 168 

I I : I : I I I : I I : I : I I : II : : I 

Db 577 GICHDSIAGYSCECPPGYTGTSCEININDCDSN PCHRGKCIDDVNSFKCL 626

Qy 169 CAAGFRGW RCEDR CEQGTYG NDCHQRCQ 196 

I I : I : I : I I hill h I I 

Db 627 CDPGYTGYICQKQINECESNPCQFDGHCQDRVGSYYCQCQAGTSGKNCEVNVNECHSN-P 685 

Qy 197 CQNGATC-DHVTG-ECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVC-HHVTG-E 252 

I lllll I : : I : 1 I I : I I I I h : I It I III I I : 

Db 686 CNNGATCIDGINSYKCQCVPGFTGQHCE KNVDECI S-SPCANNGVCIDQVNGYK 738 

Qy 253 CSCPSGWMGTVC GQP CPEGRFGKNCS — -QECQ-- 282 

ill: I I I I I I I II 

Db 739 CECPRGFYDAHCLSDVDECASNPCVNEGRCEDGINEFICHCPPGYTGKRCELDIDECSSN 798

Qy 283 -CHNGGTC-DAATG-QCHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY-HV 335 

I : I I I I I I I I I I I h : h hi I I i I I I I 

Db 799 PCQHGGTCYDKLNAFSCQCMPGYTGQKCETNIDDC VTNPCGNGGTCIDKV 848 

Qy 336 SG-ACLCEAGFAGERCEARLCPEGLYGIKCDK-RCPCHLENTHSCHPMSG ECACKP 389 

: I h h I I I I: : : I I : I I : I I I I I I I 

Db 849 NGYKCVCKVPFTGRDCESKMDP CARNRC KNEAKCTPSSNFLDFSCTCKL 897 

Qy 390 GWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTG--KCTCAPGFKGIDC 439 

|: : I I h I : I h I I I I : I I : I I h : I I I 

Db 898 GYTGRYCDEDI DECSLS SPCRNGASCLNVPGSYRCLCTKGYEGRDCAINTDDCA 951 

Qy 440 STPCP LGTYGINC SSRC GCKNDAVCSPVDGS CTCK 474 

Ml : I I I I hi I II I I I I 

Db 952 SFPCQNGRTCLDGIGDYSCLCVDGFDGKHCETDINECLSQPCQNGATCSQYVNSYTCTCP 1011

Qy 475 AGWHGVDCSIR CPSGTWGFGCN LTCQ CLN 503 

h |: : I I I I : I : | I III 

Db 1012 LGFSGINCQTNDEDCTESSCLNGGSCIDGINGYNCSCLAGYSGANCQYKLNKCDSNPCLN 1071 

Qy 504 GGACNTLDG--TCTCAPGWRGEKC ELPCQDGTYGLNCAERCDCSHADGCHPT 553 

II:: | | | | : I : : I : I I : : I II I 



Db 1072 GATCHEQNNEYTCHCPSGFTGKQCSEYVDWCGQSPCENG ATCSQMK HQF 1120

Qy 554 TGHCRCLPGWSGVHCD---SVCAEGRWGPNCSLPCYCKNGASCSP--DDGICECAPGFRG 608

: | : | 1 1 : i 1 1 1 : II III: 1 : : 1 1 : 1 : 1

Db 1121 S--CKCSAGWTGKLCDVQTISCQDAADRKGLSLRQLCNNG-TCKDYGNSHVCYCSQGYAG 1177

Qy 609 TTCQR ICS PGFYGHRCSQTCPQCV HSSGPCH 639

: | | : 1 1 1 1 1 1 : 1 1 1

Db 1178 SYCQKEIDECQSQPCQNGGTCRDLIGAYECQCRQGFQGQNCELNIDDCAPNPCQNGGTCH 1237

Qy 640 H--ITGLCDCLPGFTGALC NEVCPSGRFGKNCAGICTCTNNGTCNPIDR SCQC 690

1 I II 1 : i : 1 1 1111:1111 1 i

Db 1238 DRVMNFSCSCPPGTMGIICEINKDDCKPG ACHNNGSC--IDRVGGFECVC 1285

Qy 691 YPGWIGSDC SQPCP PAHWGPNCIHTCN 717

M : : I : 1 III 1 1 1 : 1 1 :

Db 1286 QPGFVGARCEGDINECLSNPCSNAGTLDCVQLVNNYHCNCRPGHMGRHCEHKVDFCAQSP 1345

Qy 718 CHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHIS GQC 774

Ml | : : 1 : 1 1 1 1 1 1 1 : 1 1 : 1 III 1 1

Db 1346 CQNGGNCNIRQ SGHHCI — CNNGFYGKNCEL SGQDCDSNPCRVGNC 1389

Qy 775 TCRTGFMGRHCEQKCPSGTYGYGCR QICD— CLNNSTCDHITG— TCYCSPGWKG 825

1 1 I 1 1 1 1 1 1 1 1 : 1 : : 1 II Ml

Db 1390 WADEGFGYRCE — CPRGTLGEHCEIDTLDECSPNPCAQGAACEDLLGDYECLCPSKWKG 1447

Qy 826 ARCD 829

1 1 1

Db 1448 KRCD 1451





RESULT 12 
US-08-899-232-4 

; Sequence 4, Application US/08899232 
; Patent No. 6436650 
; GENERAL INFORMATION: 

; APPLICANT: Artavanis-Tsakonas, Spyridon
; APPLICANT: Qi, Huilin 

; TITLE OF INVENTION: ACTIVATED FORMS OF NOTCH AND METHODS BASED THEREON 
; FILE REFERENCE: 7326-046 

; CURRENT APPLICATION NUMBER: US/08/899,232
; CURRENT FILING DATE: 1997-07-23
; NUMBER OF SEQ ID NOS: 4

; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 4

; LENGTH: 2703
; TYPE: PRT

; ORGANISM: Drosophila sp.
US-08-899-232-4 

Query Match 14.5%; Score 978.5; DB 4; Length 2703; 

Best Local Similarity 26.8%; Pred. No. 1.2e-56; 

Matches 290; Conservative 103; Mismatches 296; Indels 395; Gaps 70;

Qy 7 SCL SFICLLLCHWIGTASPLNLED--PNVCSHWESYSVTVQESYPHPFDQIYYTSC 60



Db 502 SCLDDPGTFRCVCMPGFTGTQCEIDIDECQSNPC LNDGTC 541 



Qy 61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMC VPHCADKCVHGR 117 

| s| |||: I M :l I : I : I 

Db 542 HDKINGFKCS CALGF--TGARCQINIDDCQSQPCRNR 576

Qy 118 CIAPNTCQCEPGWGGTNCS SACDGDHWGPHCTSRCQCKNGALCNPITG-ACH 168 

I I : I : I I I : I I : I : I I : II — I 

Db 577 GICHDSIAGYSCECPPGYTGTSCEININDCDSN PCHRGKCIDDVNSFKCL 626 



Qy 169 CAAGFRGW RCEDR CEQGTYG NDCHQRCQ 196

| | : | : 1:11 I : I I I I : I I 

Db 627 CDPGYTGYICQKQINECESNPCQFDGHCQDRVGSYYCQCQAGTSGKNCEVNVNECHSN-P 685



Qy 197 CQNGATC-DHVTG-ECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVC-HHVTG-E 252 

I I I I I I I : : I : I I I : I I II I : : I I I I I I I I I : 

Db 686 CNNGATCIDGINSYKCQCVPGFTGQHCE KNVDECI S-SPCANNGVCIDQVNGYK 738 

Qy 253 CSCPSGWMGTVC GQP CPEGRFGKNCS — -QECQ-- 282 

I I I I : I I I II 

Db 739 CECPRGFYDAHCLSDVDECASNPCVNEGRCEDGINEFICHCPPGYTGKRCELDIDECSSN 798 

Qy 283 -CHNGGTC-DAATG-QCHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY-HV 335 

I : I M I I I I I I I I I : : I : I : I I I I I I I I 

Db 799 PCQHGGTCYDKLNAFSCQCMPGYTGQKCETNIDDC VTNPCGNGGTCIDKV 848

Qy 336 SG-ACLCEAGFAGERCEARLCPEGLYGIKCDK-RCPCHLENTHSCHPMSG ECACKP 389 

: | | : I : II I I : : : I I : I I : I I I I I I I 

Db 849 NGYKCVCKVPFTGRDCESKMDP CARNRC KNEAKCTPSSNFLDFSCTCKL 897 

Qy 390 GWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTG--KCTCAPGFKGIDC 439

| : : | | I : I : I I : I I I I : I I : I I I : : I I I 

Db 898 GYTGRYCDEDI DECSLS SPCRNGASCLNVPGSYRCLCTKGYEGRDCAINTDDCA 951

Qy 440 STPCP LGTYGINC SSRC GCKNDAVCSPVDGS — CTCK 474 

Ml :| I I I 1:1 I II I Ml 

Db 952 SFPCQNGRTCLDGIGDYSCLCVDGFDGKHCETDINECLSQPCQNGATCSQYVNSYTCTCP 1011 



Qy 475 AGWHGVDCSIR CPSGTWGFGCN LTCQ CLN 503

| : I : : I I I I : I : II I I I 

Db 1012 LGFSGINCQTNDEDCTESSCLNGGSCIDGINGYNCSCLAGYSGANCQYKLNKCDSNPCLN 1071 



Qy 504 GGACNTLDG--TCTCAPGWRGEKC ELPCQDGTYGLNCAERCDCSHADGCHPT 553

II:: | | | | : | : : I : I I : : I II I 

Db 1072 GATCHEQNNEYTCHCPSGFTGKQCSEYVDWCGQSPCENG ATCSQMK— HQF 1120 

Qy 554 TGHCRCLPGWSGVHCD— -SVCAEGRWGPNCSLPCYCKNGASCSP--DDGICECAPGFRG 608 

: | : I I I : I I I I : II I I I : I : : I I : I : I 

Db 1121 S--CKCSAGWTGKLCDVQTISCQDAADRKGLSLRQLCNNG-TCKDYGNSHVCYCSQGYAG 1177

Qy 609 TTCQR ICS PGFYGHRCSQTCPQCV HSSGPCH 639 

: | | : I I I I I i : I I I 

Db 1178 SYCQKEIDECQSQPCQNGGTCRDLIGAYECQCRQGFQGQNCELNIDDCAPNPCQNGGTCH 1237

Qy 640 H--ITGLCDCLPGFTGALC NEVCPSGRFGKNCAGICTCTNNGTCNPI DR SCQC 690 

: | | | I I : I : i I I I I I : I I I I I I 

Db 1238 DRVMNFSCSCPPGTMGIICEINKDDCKPG ACHNNGSC--I DRVGGFECVC 1285 



Qy 691 YPGWIGSDC SQPCP PAHWGPNCIHTCN 717 

||:: I: I I II I : 

Db 1286 QPGFVGARCEGDINECLSNPCSNAGTLDCVQLVNNYHCNCRPGHMGRHCEHKVDFCAQSP 1345 

Qy 718 CHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHIS-— GQC 774 

||||: : M I I I I II I : M : M I I I I 

Db 1346 CQNGGNCNIRQ SGHHCI — CNNGFYGKNCEL SGQDCDSNPCRVGNC 1389 

Qy 775 TCRTGFMGRHCEQKCPSGTYGYGCR QICD— CLNNSTCDHITG— TCYCSPGWKG 825 

I I! I I M I I i I : I : : 

Db 1390 WADEGFGYRCE— CPRGTLGEHCEIDTLDECSPNPCAQGAACEDLLGDYECLCPSKWKG 1447 

Qy 826 ARCD 829 

I I I 

Db 1448 KRCD 1451 



RESULT 13 
US-09-230-652-2 

Sequence 2, Application US/09230652A 
Patent No. 6537775 
GENERAL INFORMATION: 
APPLICANT: Tournier-Lasserve, Elisabeth
APPLICANT: Joutel, Anne 
APPLICANT: Bousser, Marie-Germaine 
APPLICANT: Bach, Jean-Francois 

TITLE OF INVENTION: GENE INVOLVED IN CADASIL, METHOD OF DIAGNOSIS AND
TITLE OF INVENTION: THERAPEUTIC APPLICATION 
FILE REFERENCE: 03715.0048-00000 
CURRENT APPLICATION NUMBER: US/09/230,652A
CURRENT FILING DATE: 1999-05-17 
EARLIER APPLICATION NUMBER: FR 96 09733 
EARLIER FILING DATE: 1996-08-01 
EARLIER APPLICATION NUMBER: FR 97 04680 
EARLIER FILING DATE: 1997-04-16 
EARLIER APPLICATION NUMBER: PCT/FR97/01433
EARLIER FILING DATE: 1997-07-31
NUMBER OF SEQ ID NOS: 163
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2 
LENGTH: 2321 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE:

OTHER INFORMATION: human ADNc No. 6537775ch 3 
US-09-230-652-2 

Query Match 14.4%; Score 974; DB 4; Length 2321; 

Best Local Similarity 25.0%; Pred. No. 2.1e-56; 

Matches 304; Conservative 93; Mismatches 282; Indels 537; Gaps 70; 

Qy 59 SCTDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKC 113

: M : I : I II M : I M M

Db 250 TCVDGVNTYNC QCPPEW--TGQFCTED-VDECQLQPN 283

Qy 114 -VH--GRC IAPNTCQCEPGWGGTNCSSACDG DHWGPHCTSR CQC 154

III : : : | | I I I : I I I ill II

Db 284 ACHNGGTCFNTLGGHSCVCVNGWTGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMG 343

Qy 155 KNGALC NPITG— ACHCAAGFRGWRCE— DRCE 183 

| | | | I i : I I I I I I I : II 

Db 344 KTGLLCHLDDACVSNPCHEDAICDTNPVNGRAICTCPPGFTGGACDQDVDECSIGANPCE 403 

Qy 184 QGTYGNDCHQ RCQ CQNGATCDHVTGE — CRCPPGYTG 218 

||:: | : I I : I : I III h I I hll 

Db 404 HLGRCVNTQGSFLCQCGRGYTGPRCETDVNECLSGPCRNQATCLDRIGQFTCICMAGFTG 463 



Qy 219 AFCE DLCPPGKHGPQCEQRCPCQNGGVC-HHVTG-ECSCPSGWMGTVC 264

: | | || I I I I I I I I II I : I I I I : I : I 

Db 464 TYCEVDIDEC QSSPCVNGGVCKDRVNGFSCTCPSGFSGSTCQLDVDECAS 513



Qy 265 GQP CPEGRFGKNCSQ ECQ CHNGGTCDA-ATGQCHCSPG 301

I | Mill: : I I I : I I I : I I : I I 

Db 514 TPCRNGAKCVDQPDGYECRCAEGFEGTLCDRNVDDCSPDPCHHGRCVDGIASFSCACAPG 573 

Qy 302 YTGERCQDE CPVGTYGVLC AETCQ 325 

INN:: I I I I i I I : I 

Db 574 YTGTRCESQVDECRSQPCRHGGKCLDLVDKYLCRCPSGTTGVNCEVNIDDCASNPCTFGV 633



Qy 326 CVNGGKCYHVSGACLCEAGFAGERCEARL CPEGLYGIKCDKRCP 369

I : I I | : | : | | I I : | : | I : I II 

Db 634 CRDGINRYD CVCQPGFTGPLCNVEINECASSPCGEGGSCVDGENGFRC--LCPPGS 687 



Qy 370 CHLENTHSC HPMSGECACKPGWSGLYCNE 398

| | : I I I IIMIMII:: 

Db 688 LPPLC-LPPSHPCAHEPCSHGICYDAPGGFRCVCEPGWSGPRCSQSLARDACESQPCRAG 746 



Qy 399 TCSPGFYGEACQQI — CS CQNGADCDSVTGK CTCAPGFKG-- 436

|| || | |: : |: |::| |:| |: hi |::l 
Db 747 GTCSSDGMGFHCTCPPGVQGRQCELLSPCTPNPCEHGGRCESAPGQLPVCSCPQGWQGPR 806



Qy 437 IDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTW 491 

: | : | I I : II I : : I II I I I : I I 

Db 807 CQQDVDECAGPAPCGPHGI-CTNLAG SFSCTCHGGYTGPSCDQDIND 852

Qy 492 GFGCNLTCQCLNGGACNTLDG TCTCAPGWRGEKC ELPCQDGTYGLNCA 539

| : I II I i : I M : i : I I I : I : I II II I 
Db 853 CDPN-PCLNGGSCQ--DGVGSFSCSCLPGFAGPRCARDVDECLSNPCGPGT CT 902 



Qy 540 ERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI 599

I : I I I 1 : I I I : I : = I I I = 

Db 903 D HVASFTCTCPPGYGGFHCEQDL PDCS-PSSCFNGGTCV DGV 943 

Qy 600 CECAPGFRGTTCQR ICSPGFYGHRC SQTCPQC 631 

1111:111 : I I IN I I I I I 

Db 944 NSFSCLCRPGYTGAHCQHEADPCLSRPCLHGGVCSAAHPGFRCTCLESFTGPQCQTLVDW 1003 



Qy 632 VHSSGPCHHITGLCDCLPGFTGALCNE 658

: I I I I I I : : I I I : 

Db 1004 CSRQPCQNGGRCVQTGAYCLCPPGWSGRLCDIRSLPCREAAAQIGVRLEQLCQAGGQCVD 1063 



Qy 659 VCPSGRFGKNC AGIC TCTNNGTCNPI--DRSCQCYPGWIGSDC 699

Ml II I : I I I : M I I : I I I : I : I 

Db 1064 EDSSHYCVCPEGRTGSHCEQEVDPCLAQPCQHGGTCRGYMGGYMCECLPGYNGDNCEDDV 1123 



Qy 700 S Q PC PPAHWGPNCIHTCNCH 719

Db 1124 DECASQPCQHGGSCIDLVARYLCSCPPGTLGVLCEINEDDCGPGPPLDSGPRCLHNGTCV 1183

Qy 720 N--GAFCSAYDGECKCTPGWTGLYCTQ RCPLG 749

: I I 1111:1111 I 1

Db 1184 DLVGGF RCTCPPGYTGLRCEADINECRSGACHAAHTRDCLQDPGGGFRCLCHAG 1237

Qy 750 FYGKDCALI CQ CQNGADCDHISG QCTCRTGFMGRHCEQ 787

III: I : I I : I I I II I I I I =

Db 1238

Qy 788 -KCPSGTYGYGCRQI CDCLNNSTCDHIT 814

I I I I I I II : : I

Db 1298

Qy 815 -GTCYCSPGWKGARCD 829

11:11111:

Db 1358



RESULT 14 
US-09-467-997-1 

Sequence 1, Application US/09467997 
Patent No. 6379925 
GENERAL INFORMATION: 
APPLICANT: Kitajewski, Jan 
APPLICANT: Uyttendaele, Hendrik 

TITLE OF INVENTION: ANGIOGENIC MODULATION BY NOTCH SIGNAL TRANSDUCTION 
FILE REFERENCE: 53863-A-PCT-US
CURRENT APPLICATION NUMBER: US/09/467,997
CURRENT FILING DATE: 1999-12-20
NUMBER OF SEQ ID NOS: 10
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 1 
LENGTH: 1964 
TYPE: PRT 
ORGANISM: mouse 
US-09-467-997-1 

Query Match 14.1%; Score 953.5; DB 4; Length 1964; 

Best Local Similarity 25.0%; Pred. No. 4e-55; 

Matches 297; Conservative 75; Mismatches 303; Indels 513; Gaps 61; 

Qy 109 DKCVHGRCI APNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALC 160

: ||: 11111:11 : I Nil I

Db 32 N GGTCLRLSRGQGICQCAPGFLGETC QFPDPCRDTQLCKNGGSCQALL 81

Qy 161 181

I : I : I I : I I I II :

Db 82

Qy 182 223

M I I I I I IN I : :| I I I I I : I II

Db 142



Qy 224 LCPPGKHGPQCEQR— CP— CQNGGVC HHVTGECSC 255 

I I I I : I I I I : I M 

Db 202 CPQGTSCHNTLGSYQCLCPVGQEGPQCKLRKGACPPGSCLNGGTCQLVPEGHSTFHLCLC 261

Qy 256 PSGWMGTVCGQ PCPEGRFGKNCSQ— ECQ 282 

| | : | | | | | : | : | | : I I : 

Db 262 PPGFTGLDCEMNPDDCVRHQCQNGATCLDGLDTYTCPCPKTWKGWDCSEDIDECEARGPP 321



Qy 283 -CHNGGTCDAATG--QCHCSPGYTGERCQDE CPVG 314

I I I I I I I I I I : I I : = Ml 

Db 322 RCRNGGTCQNTAGSFHCVCVSGWGGAGCEENLDDCAAATCAPGSTCIDRVGSFSCLCPPG 381 



Qy 315 TYGVLC AETCQ 325 

1:11 : H 

Db 382 RTGLLCHLEDMCLSQPCHVNAQCSTNPLTGSTLCICQPGYSGSTCHQDLDECQMAQQGPS 441 

Qy 326 -CVNGGKCYHVSGA— CLCEAGFAGERCEAR LCPEG 358

| : I I I : i : I I I I : I I M I III I 

Db 442 PCEHGGSCINTPGSFNCLCLPGYTGSRCEADHNECLSQPCHPGSTCLDLLATFHCLCPPG 501 



Qy 359 LYGIKCD KRC PCHLENTHSCHPMSG — ECACKPGWSGLYCNE 398

|| | : I II I : II : : | I I I : : i I : 

Db 502 LEGRLCEVEVNECTSNPC— LNQAACHDLLNGFQCLCLPGFTGARCEKDMDECSSTPCAN 559 



Qy 399 TCSPGFYGEACQQICS CQNGADCDSVTGK--CTCAPGFKGI 437

| | I I I I : : I I I I : I I I Ml I 

Db 560 GGRCRDQPGAFYCECLPGFEGPHCEKEVDECLSDPCPVGASCLDLPGAFFCLCRPGFTGQ 619 



Qy 438 DCSTP CPLGTYGI NCSSRCG 457

|| I I I : I III 

Db 620 LCEVPLCTPNMCQPGQQCQGQEHRAPCLCPDGSPGCVPAEDNCPCHHGHCQRSLCVCDEG 679



Qy 458 CKNDAVCSPVDG — SCTCKAGWHGVDCS IRCPSGTW 491

I : II : I I I I I : I : I I I I I 

Db 680 WTGPECETELGGCISTPCAHGGTCHPQPSGYNCTCPAGYMGLTCSEEVTACHSGPCLNGG 739 



Qy 492 GFGCN LTCQCLNGGACNTLDGT--CTCAPGWRGEKC 525

|:| r : I I I I I I I I I I I I :: I I 

Db 740 SCSIRPEGYSCTCLPSHTGRHCQTAVDHCVSASCLNGGTCVNKPGTFFCLCATGFQGLHC 799 



Qy 526 E LPCQDGTYGLNC AERCDCSHADGCHPT 553

I I I I I I I I I I : 

Db 800 EEKTNPSCADS PCRNKATCQDTPRGARCLCS PGYTGS SCQTLI DLCARKPCPHTARCLQS 859 



Qy 554 — TGHCRCLPGWSGVHCD-- SVCAEGRWGPNCSLPCYCKNGASCSPDDG— ICECAPGF 606 

: I II 11:1 II I : : 1 = 1 I I I 

Db 860 GPSFQCLCLQGWTGALCDFPLSCQKAAMSQGIEISGLCQNGGLCI-DTGSSYFCRCPPGF 918

Qy 607 RGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFG 666 

: | | | : | I INI: Ml M : I III 

Db 919 QGKLCQDNVNP C~ EPNPCHHGS— TCVPQPSGYVCQ— CAPGYEG 95 8 



Qy 667 KNCAGI — C TCTNNGTC— NPIDRSCQCYPGWIGSDC SQPCPPAHWGP 710

: I I : : I I Mill I I I I M M I Ml M 
Db 959 QNCSKVLDACQSQPCHNHGTCTSRPGGFHCACPPGFVGLRCEGDVDECLDRPCHPS 1014 



Qy 711 NCIHTCNCHNGAFCSAYDGECKCTPGWTGLYC 742

I || : : I : I : I I I I I I

Db 1015 — -GTAACH— SLANAF— YCQCLPGHTGQRCEVEMDLCQSQPCSNGGSCEITTGPPPGF 1067

Qy 743 TQRCPLGFYGKDC ALIC QCQNGADC DHI SGQCTCRTGFMGRHC-EQKCP 790

1 I I I I I I III I II I I I : 1 I I I I

Db 1068 TCHCPKGFEGPTCSHKALSCGIHHCHNGGLCLPSPKPGSPPLCACLSGFGGPDCLTPPAP 1127

Qy 791 SGTYGYGCRQICDCLNNSTCDHITG TCYCS PGWKGARCDQAG 832

II Ihlll I I I I I I I : I

Db 1128



RESULT 15 
US-09-214-278-2 

Sequence 2, Application US/09214278 
Patent No. 6291210 
GENERAL INFORMATION: 
APPLICANT: Sakano, Seiji 
APPLICANT: Itoh, Akira 

TITLE OF INVENTION: DIFFERENTIATION-SUPPRESSIVE POLYPEPTIDE
FILE REFERENCE: KP-8576 

CURRENT APPLICATION NUMBER: US/09/214,278
CURRENT FILING DATE: 1999-01-26
NUMBER OF SEQ ID NOS: 32
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2
LENGTH: 1055
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-214-278-2 

Query Match 11.7%; Score 790; DB 3; Length 1055; 

Best Local Similarity 27.7%; Pred. No. 1.5e-44; 

Matches 291; Conservative 85; Mismatches 377; Indels 296; Gaps 69; 

Qy 15 LLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQI YYTSCTDILNWFKCTRHRV 74 

|| : | : I I I I : : : : | : | | : : : III 
Db 135 LLIERVSHAGMINPEDRWKSLHFSGHVAHLELQIRVRCDENYYSATCN KFCRPRN 189 

Qy 75 SYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPH-CADKC— VHGRCIAPNTCQCEPGWG 131

: | 1 | : | I : I : I I I : I I I I I : I I I 

Db 190 DF FGHYTCDQYGNKA-CMDGW — MGKECKEAVCKQGCNLLHGGCTVPGECRCSYGWQ 243 

Qy 132 GTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDC 191 

|| | | : | : I | : | : I I : II 
Db 244 GRFCD ECVPYPGCVHGSCVEP— WQCNCETNWGGLLCDKDL NYC 285

Qy 192 HQRCQCQNGATCDHVTGE CRCPPGYTGAFCEDLCPPGKHGPQCEQRC PCQNGGV 245 

INN: : 11111:111 : I I M III 

Db 286 ESHHPCTNGGTCINAEPDQYRCTCPDGYSGRNCE KAEHACTSNPCANGGS 335

Qy 246 CHHVTG--ECSCPSGWMGTVC GQP CPEGRFGKNC- 277 

III I I I I I I I I I I I II I I 

Db 336 CHEVPSGFECHCPSGWSGPTCALDIDECASNPCAAGGTCVDQVDGFECICPEQWVGATCQ 395 

Qy 278 --SQECQ CHNGGTCDAATG--QCHCSPGYTGERCQ DECPVGTYGVLCAETCQCV 327

: | | : | I : I I I I I I : I I : : I M 



Db 396 LDANECEGKPCLNAFSCKNLIGGYYCDCIPGWKGINCHINVNDC RGQCQ 444

Qy 328 NGGKCYH-VSG-ACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGEC 385 

: I I I 1:11:111111 I M IN M 

Db 445 HGGTCKDLWGYQCVCPRGFGGRHCE LERDKC-ASSPCH SGG- 485 

Qy 386 ACKPGWSGLYCNETCSPGFYGEACQ QICS CQNGADCDSVTGK--CTCAPGFKGI 437

I : I : I : : : I I : I I I I : : I II II 

Db 486 LCEDLADGFHCH — CPQGFSGPLCEVDVDLCEPSPCRNGARCYNLEGDYYCACPDDFGGK 543 

Qy 438 DC ST PCPLGTYGINCSSRCGCKNDA VCSPVDGSCTCKAGWHGVDCS 483

: | I I I I I I I I : I I Mill: I : I 

Db 544 NCSVPREPCPGGA CRVIDGCGSDAGPGMPGTAASGVCGP-HGRCVSQPGG— NFS 595 



Qy 484 IRCPSGTWGFGCN LTCQCLNGGAC-NTLDG-TCTCAPGWRGEKCE LP- 528

Mill: I Mill: H I I 11111= I I 

Db 596 CICDSGFTGTYCHENIDDCLGQPCRNGGTCIDEVDAFRCFCPSGWEGELCDTNPNDCLPD 655 

Qy 529 CQDGTYGLNCAER CD CSHADGCHPT — TGHCRCLPG 562

Mill : I :: I I I I I 

Db 656 PCHSRGRCYDLVNDFYCACDDGWKGKTCHSREFQCDAYTCSNGGTCYDSGDTFRCACPPG 715 

Qy 563 WSGVHCDSVCAEGRWGPNCSLPCYCKNGASC— SPDDGICECAPGFRGTTCQRICSPGFY 620 

|| Ml : : I II I I I : I I I I hill 
Db 716 WKG STCAVAK-NSSC-LPNPCVNGGTCVGSGASFSCICRDGWEGRTCT 761

Qy 621 GHRCSQTCPQCVHSSGPCHHITGL CD CL P GFT GALCN EVCP S GRFGKNCAGI C 673 

| : | : : | | : I : hi III I I : I I I I 

Db 762 -HNTNDCNPLPCYNGGIC — VDGVNWFRCECAPGFAGPDCRINIDECQS SPCAYGA 814 

Qy 674 TCTN — NGTCNPIDRSCQCYPGWIGSDC SQPCPPAHWGPNCIHTCNCHN 720

| | : I I I I II I I I : I I I : : M 

Db 815 TCVDEING YRCSCPPGRAGPRCQEVIGFGRSCWSRGTPFPH-GSSWVEDCNS— 865 

Qy 721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADC-DHISGQC TC 776 

| M I : I I : I I I : I I I I I I : I I I I 

Db 866 CRCLDGRRDCSKVWCG WKPCLLA — GQPEALSAQCPLGQRCLEKAPGQCLRPPC 917



Qy 777 RT-GFMGRHCEQKCP SGTYGYGCRQICDCLNNSTCDHI TGTCYCS PG 822

|| I II I :: I II: I II I 

Db 918 EAWGECGAEEPPSTPCLPRSGHLDNNCARLTLHFNR DHVPQGTTVGAICSGIRSLPA 974

Qy 823 WKGARCDQAGVIIVGNLNS-LSRTSTA LPADS YQI GAIAGI 1 1 LVLWLFLL 873

| : | : : : I I I I I I I I I : : 
Db 975 T RAVARD RL LVLL C D RAS S GAS AVE VAVS F S P ARD L P D S S L I QGAAHAI VAAI 1027 



Qy 874 ALFII YRHKQKGKESSMPAVTYTPAMRW 902

I : I I : I I I II 
Db 1028 TQRGNS SLLLAVTEVKVETW 1048



Search completed: March 26, 2004, 16:13:05 
Job time : 46.759 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: March 26, 2004, 16:05:25 ; Search time 23.117 Seconds 

(without alignments) 
4743.616 Million cell updates/sec 



Title: US-10-092-390-2

Perfect score: 6744

Sequence: 1 MVISLNSCLSFICLLLCHWI SSPKQEDSGGSSSNSSSSSE 1140



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database 



PIR li 



pir1 : *
pir2 : * 
pir3 : * 
pir4 : * 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
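
The per-result statistics printed with each alignment (Query Match and Best Local Similarity) can be reproduced from the other numbers reported for that result. The sketch below is an illustration only. The function names are hypothetical, and the two formulas (score divided by the perfect score; matches divided by the sum of matches, conservative substitutions, mismatches and indels) are an assumption inferred from the values in this listing, not a documented GenCore definition. It makes no attempt to reproduce Pred. No.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-match) score.

    Assumed formula, inferred from the listing: 100 * score / perfect_score.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_percent(
    matches: int, conservative: int, mismatches: int, indels: int
) -> float:
    """Identical positions as a percentage of all aligned columns.

    Assumed formula, inferred from the listing: the column count is taken
    as matches + conservative + mismatches + indels.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Values reported for RESULT 1 (T13954) of the PIR search.
    print(query_match_percent(1958, 6744))
    print(best_local_similarity_percent(344, 77, 306, 106))
```

With the values reported for RESULT 1 of the PIR search (Score 1958 against a perfect score of 6744; Matches 344, Conservative 77, Mismatches 306, Indels 106), these functions return 29.0 and 41.3, which agree with the figures printed for that result.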



SUMMARIES 

% 

Result Query 



No. Score Match Length DB ID Description 
MEGF6 protein - ra
hypothetical prote
3
1805.5
2
5
1032
15
.2
2531
2
A46019
notch-1 protein -
notch3 protein - h
13 .
6
1
1
2
11 .
2
crumbs protein - f
22
11 .
4
2
tenascin-C - human
23
11 .
4
1746
1
tenascin precursor
1
1
25
2
jagged protein pre
716
10 .
, 6
3106
1
30
1790
1
MMFFB1
1?
9
. 9
1786
1
MMHUB1
laminin beta-1 cha
666
T23064
hypothetical prote
41
9
2
protein T22A3.8 [i
42
9
. 9
2
43
9
. 8
1
MMMSB1
44
9
.7
1429
2
homeotic protein 1
45
9
.5
1797
2
A55677
laminin beta-2 cha



ALIGNMENTS 



RESULT 1 
T13954 

MEGF6 protein - rat 

C;Species: Rattus norvegicus (Norway rat)

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 21-Jul-2000
C;Accession: T13954

R;Nakayama, M.; Nakajima, D.; Nagase, T.; Nomura, N.; Seki, N.; Ohara, O.
Genomics 51, 27-34, 1998

A;Title: Identification of high-molecular-weight proteins with multiple EGF-like
motifs by motif-trap screening.

A;Reference number: Z14126; MUID:98360089; PMID:9693030
A;Accession: T13954

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1574 <NAK>

A;Cross-references: EMBL:AB011532; NID:g3449293; PIDN:BAA32462.1; PID:g3449294
A;Experimental source: strain Sprague-Dawley; brain
C;Genetics:
A;Gene: MEGF6



Query Match 29.0%; Score 1958; DB 2; Length 1574; 

Best Local Similarity 41.3%; Pred. No. 2.6e-96; 

Matches 344; Conservative 77; Mismatches 306; Indels 106; Gaps 16; 

Qy 95 CP-GFYESGEMCVP--HCADKCVHGRC-IAPNTCQCEPGWGGTNCSSACDGDHWGPHCTS 150 

| | M | | : I Ml:: I II I I : I I I I I I = I I I : 

Db 602 CPKGFY — GKHCRKKCHCANR GRCHRLYGACLCDPGLYGRFCHLACPPWAFGPGCSE 656 

Qy 151 RCQCKNG--ALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTG 2 08 

||: III I : I I I I I : I I I : I I I : I I Mill I I I : I 

Db 657 DCLCEQSHTRSCNPKDGSCSCKAGFQGERCQAECESGFFGPGCRHRCTCQPGVACDPVSG 716

Qy 209 ECR--CPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQ 266

Ml Mill I I I I 1 I I I 1 I I I I I I I 1 II I I I I 

Db 717 ECRTQCPPGYQGEDCGQECPVGTFGWCSGSCSCV-GAPCHRVTGECLCPPGKTGEDCGA 775 

Qy 2 67 PCPEGRFGKNCSQEC-QCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQ 325 

| I I I I : I |:| I : I : I : I I I I I I : I M I I 

Db 776 DCPEGRWGLGCQEICPACEHGASCNPETGTCLCLPGFVGSRCQDTCSAGWYGTGCQIRCA 835 

Qy 326 CVNGG KCYH VSGACLC 341 

Ml I I I I I I I I 

Db 836 CANDGHCDPTTGRCSCAPGWTGLSCQRACDSGHWGPDCIHPCNCSAGHGNCDAVSGLCLC 895

Qy 342 EAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCS 401 

| | | : I I I I : I : I I I I : : : I I I : : I : I I I I 111= I I 
Db 896 EAGYEGPRCE-QSCRQGYYGPSCEQKCRC—EHGAACDHVSGACTCPAGWRGSFCEHACP 952 



Qy 402 PGFYG EA CQQICSCQNG 418

||:| :| I Ml:l II 

Db 953 AGFFGLDCDSACNCSAGAPCDAVTGSCICPAGRWGPRCAQSCPPLTFGLNCSQICTCFNG 1012 



Qy 419 ADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWH 478 

I I I I I I I : : I 1 M I M II M:l I I : I I M II 

Db 1013 ASCDSVTGQCHCAPGWMGPTCLQACPPGLYGKNCQHSCLCRNGGRCDPILGQCTCPEGWT 1072 

Qy 479 GVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNC 538

| : | | | : I I I I I I : I I I : I I I I I I I : I I : I I I : I : : I 

Db 1073 GLACENECLPGHYAAGCQLNCSCLHGGICDRLTGHCLCPAGWTGDKCQSSCVSGTFGVHC 1132 

Qy 539 AERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDG 598 

Ml || II I I III I II: I I :M: I I I I 1 I 

Db 1133 EEHCACRKGASCHHVTGACFCPPGWRGPHCEQACPRGWFGEACAQRCLCPTNASCHHVTG 1192 

Qy 599 ICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNE 658

| | | | | | : I : : I I I : I I III : I = I : I I I : I I : 

Db 1193 ECRCPPGFTGLSCEQACQPGTFGKDCEHLC-QCPGETWACDPASGVCTCAAGYHGTGCLQ 1251

Qy 659 VCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNC 718

| | | | | : | | I 1:1 : I I I : : I : I I I II : I I : I I I I 

Db 1252 RCPSGRYGPGCEHICKCLNGGTCDPATGACYCPAGFLGADCSLACPQGRFGPSCAHVCAC 1311 

Qy 719 HNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRT 778

Ml llhlllhl I i Ml I I I I'll I : I I : I 

Db 1312 RQGAACDPVSGACICSPGKTGVRCEHGCPQDRFGKGCELKCACRNGGLCHATNGSCSCPL 1371 



Qy 779 GFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQA 831 

1:11111 11:111 I I I I I : h I I I I I I : I I : : 
Db 1372 GWMGPHCEHACPAGRYGAACLLECFCQNNGSCEPTTGACLCGPGFYGQACEHS 1424 



RESULT 2 
T27283 

hypothetical protein Y64G10A.f - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T27283
R;Ainscough, R.

submitted to the EMBL Data Library, September 1999
A;Reference number: Z20336
A;Accession: T27283

A;Status: preliminary; translated from GB/EMBL/DDBJ

A;Molecule type: DNA

A;Residues: 1-1620 <WIL>

A;Cross-references: EMBL:AL110498; NID:e1542303; PIDN:CAB54471.1; CESP:Y64G10A.f
A;Experimental source: clone Y64G10A
C;Genetics:

A;Gene: CESP:Y64G10A.f

A;Introns: 77/1; 116/1; 198/1; 282/1; 365/1; 425/1; 466/1; 548/1; 559/1; 601/1; 
625/1; 715/1; 782/3; 845/1; 895/2; 956/1; 1105/1; 1221/1; 1307/1; 1445/2 

Query Match 28.2%; Score 1900; DB 2; Length 1620; 

Best Local Similarity 38.6%; Pred. No. 3.1e-93; 

Matches 331; Conservative 88; Mismatches 307; Indels 132; Gaps 17; 

Qy 95 CP-GFYESGEMCVPHCADKCVHGRCIAP-NTCQCEPGWGGTNCSSACDGDHWGPHCTSRC 152

I I M I I I I |:IM | | i | : | | : I : I i

Db 712 CPDGFY--GSQCNLKCRMDCPNGRCDPVFGYCTCPDGLYGQSCEKPCPHFTFGKNCRFPC 769

Qy 153 QC--KNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGEC 210

: I : I I : : I I : I I : I I : I I I II 111 = 1

Db 770 KCARENSEGCDEITGKCRCKPGYYGHHCKRMCSPGLFGAGCAMKCSCPAGIRCDPVTGDC 829

Qy 211 --RCPPGYTGAFCEDLCPPGKHGPQCEQRCPC QN GGVCHH VT GEC S CPS GWMGT 262

: | | | | | I : III I I I I : 1 I I I I I I I I I : I I I

Db 830

Qy 263

: | | I : I I I : I I I I I I : I I I I : I I :

Db 890 LCDQCLIFVETIEFDIAFSINVIACAPNTYGPNCAHTCSCVNGAKCDESDGSCHCTPGFY 949

Qy 304 GERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIK 363

| | : I I I : I : I : I : I I I I : I : I I I : : I : : I : : I I : I

Db 950 GATCSEVCPTGRFGIDCMQLCKCQNGAICDTSNGSCECAPGWSGKKCD-KACAPGTFGKD 1008

Qy 364 CDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDS 423

| | : I I : I I I I I I I I I I I = M I I ' I I : I M II I I I I I I

Db 1009 CSKKCDC-ADGMH-CDPSDGECICPPGKKGHKCDETCDSGLFGAGCKGICSCQNGATCDS 1066

Qy 424 VTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGC— KND AVCSPVDGSCTC 473

I I I I I I I : : I I I I I I : I I : : I I I I I I I I I I

Db 1067

Qy 474 KAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQD--

1 I I 1 1 1 III III : 1 1 1 1 1 : 1 : : 1 1 1 1 : 1 : 1 1 1 :

Db 1127 PAGWTGPDCQTSCPLGRHGEGCRHSCQCSNGASCDRVTGFCDCPSGFMGKNCESECPEGL 1186

Qy 532 553

1 : 1 1 1 1 : 1 1 : 1 : 1

Db 1187 WGSNCMKHCLCMHGGECNKENGDCECIDGWTGPSLCPFGQFGRNCAQRCNCKNGASCDRK 1246

Qy 554 TGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQR 613

I I I I I I 1 1 1 1 1 1 : 1 1 : 1 1 1 1 : 1 II 1 1 1 1 1 1 1 : 1 1 1 1

Db 1247 TGRCECLPGWSGEHCEKSCVSGHYGAKCEETCECENGALCDPISGHCSCQPGWRGKKCNR 1306

Qy 614 ICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCP 661

1 i : : 1 | | | : | : I : I 1 I 1 : I 1 1 1 : 1 1 1 : 1 1

Db 1307 PCLKGYFGRHCSQSC-RCANSKS-CDHISGRCQCPKGYAGHSCTELCPDGTFGESCSQKC 1364

Qy 662 SGRFGKNCAGICTCTNNGTCNPIDRSCQC 690

1111:1 : 1 : 1 1 : Ml

Db 1365 DCGENSMCDAISGKCFCKPGHSGSDCKSGCVQGRFGPDCNQLCSCENGGVCDSSTGSCVC 1424

Qy 691 YPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGF 750

I | : | | : I 1 : 1 1 1 1 1 1 1 1 1 : : : :

Db 1425 PPGYIGTKCEIACQSDRFGPTCEKICNCENGGTCDRLTGQCRCLPGFTGMTCNQVCPEGR

Qy 751 YGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTC 810

: | | I : I I 1 1 : 1 1 : 1 1 III 1 1 I 1 1 1 1 1 1 1 1 : 1 : 1

Db 1485 FGAGCKEKCRCANG-HCNASSGECKCNLGFTGPSCEQSCPSGKYGLNCTLDCECYGQARC 1543

Qy 811 DHITGTCYCSPGWKGARC 828

1:11111 1 : 1 1

Db 1544 DPVQGCCDCPPGRYGSRC 1561





RESULT 3 

T26972 

hypothetical protein Y47H9C.4 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 17-Mar-2000
C;Accession: T26972
R;Harris, B.
submitted to the EMBL Data Library, October 1998
A;Reference number: Z20293
A;Accession: T26972
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1111 <WIL>
A;Cross-references: EMBL:AL032657; PIDN:CAA21739.1; GSPDB:GN00019; CESP:Y47H9C.4
A;Experimental source: clone Y47H9C
C;Genetics:
A;Gene: CESP:Y47H9C.4
A;Map position: 1
A;Introns: 50/2; 84/2; 150/1; 238/3; 342/3; 797/1; 851/1; 947/2; 1017/1; 1083/1
C;Superfamily: unassigned ankyrin repeat proteins; ankyrin repeat homology; EGF homology



Query Match 26.8%; Score 1805.5; DB 2; Length 1111; 

Best Local Similarity 31.8%; Pred. No. 2.3e-88; 



Matches 373; Conservative 162; Mismatches 423; Indels 215; Gaps 



Qy 21 GTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYT SCTDILNWFKCTRHR 73 

Ml : : I I : I : : I : : : I I : I I : I 

Db 35 GTTEP QGDHVCT VKTIVDDY — ELKKVIHTWYNDTEQCLNPLTGFQC 80 

Qy 74 VSYRTAYRHGEKTMYRRK SQCCPGFYESGE-MCVPHCADKCVHGRCIAPNTC 124 

I : | : | I : I : I I i I : I : : : I : I I I hill I 

Db 81 TVEKRGQKASYQRQLVKKEKYVKQCCDGYYQTKDHFCLPDCNPPCKKGKCIEPGKC 136

Qy 125 QCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQ 184 

: 1 : I I : I I I : h I II h I h I I I h I I I I : I h I I I I I 

Db 137 ECDPGYGGKYCASSCSVGTWGLGCSKSCDCENGANCDPELGTCICTSGFQGERCEKPCPD 196 

Qy 185 GTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGG 244 

: I : I : I I I II h hi I h I I I : i I I I : h I I I I I 

Db 197 NKWGPNCVKSCPCQNGGKCNK-EGKCVCSDGWGGEFCLNKCEEGKFGAECKFECNCQNGA 255 

Qy 245 VCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTG 304 

| : I : I i I i : I : I I III h I : I I I | I : : : I : I I h I I 

Db 256 TCDNTNGKCICKSGYHGALCENECSVGFFGSGCTQKCDCLNNQNCDSSSGECKCI-GWTG 314 

Qy 305 ERCQDECPVGTYGVLCAETCQCV NGGKCYHVSGACLCEAGFAGERCEARLCPEG 358

: | | | : | : | : | I : I : I I I I : h I : h I I 

Db 315 KHCDIGCSRGRFGLQCKQNCTCPGLEFSDSNASCDAKTGQCQCESGYKGPKCDERKCDAE 374

Qy 359 LYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEAC — QQICSCQ 416 

II I I I I III I I : I I I I I h I I II II I i : I 

Db 375 QYGADCSKTCTCVRENTLMCAPNTGFCRCKPGFYGDNCELACSKDSYGPNCEKQAMCDWN 434 

Qy 417 NGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAV-CSPVDGSCTCKA 475 

: : : I : I I I I I I I : I I I I I I II I h : I I II I I I I 

Db 435 HASECNPETGSCVCKPGRTGKNCSEPCPLDFYGPNCAHQCQCNQRGVGCDGADGKCQCDR 494 

Qy 476 GWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYG 535 

Ml | I I : I : I I hi I h : I I I I I : I h : I : h M 

Db 495 GWTGHRCEHHCPADTFGANCEKRCKCPKGIGCDPITGECTCPAGLQGANCDIGCPEGSYG 554 

Qy 536 LNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSP 595 

I ||: I I I I 1 I h I I : I : : I : : I : I I I I : I I I I 

Db 555 PGCKLHCKCVNGK-CDKETGECTCQPGFFGSDCSTTCSKGKYGESCELSCPCSD-ASCSK 612 

Qy 596 DDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHI TGLC-DCLPGF 651 

Ml I : I : I : I I : I I : I : I I hi I I I 

Db 613 QTGKCLCPLGTKGVSCDQKCDPNTFGFLCQETV TPSPCASTDPKNGVCLSCPPGS 667 

Qy 652 TGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPN 711 

: | | I I : I : I I : h I : hi I I I h I I h I I : I 

Db 668 SG1HCEHNCPAGSYGDGCQQVCSCADGHGCDPTTGECICEPGYHGKTCSEKCPDGKYGYG 727 

Qy 712 CIHTC-NCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNG-ADCDH 769 

i i I : h I : I I I I I I h I I h I I : h I : h 

Db 728 CALDCPKCASGSTCDHINGLCICPAGLEGALCTRPCSAGFWGNGCRQVCRCTSEYKQCNA 787 

Qy 770 ISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNST — CDHITGTCYCSPGWKGAR 827 

: h |: I Ml I : : I III I : I I : I h : : I h I I h I 

Db 788 QTGECSCPAGFQGDRCDKPCEDGYYGPDCIKKCKCQGTATSSCNRVSGACHCHPGFTGEF 847



Qy 828 CDQAGVIIVGNLNSLSRTST ALP ADSYQIGAIAG 861 

! : : I I I I I I : I I 

Db 848 C HALCPESTFGLKCSKECPKDGCGDGYECDAAIGCCHVDQMSCGKAKQE 896

Qy 862 IIILVLWLF LLALFI I YRHK-QKGKES SMPAVT YT PAMRV 901 

: I : : | | I : I I I I I : I I I I : I I I : : 
Db 897 FEALNGAGRSTGLTWFFVLLIVALCGGLGLIALF — YRNKYQKEKDPDMPTVSF 948 

Qy 902 VNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVN 961 

:: : I : I : : I : I 

Db 949 HKAPNNDEGREFQNPLY SRQSVFP DSDAFSSENNGNHQ 986

Qy 962 LKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYM GKSLKDLGKN -- 1013

I I I I I : : I : I I I 

Db 987 GGPPN — GLLTLEEEELENKKIHGRSAAGRGNNDY 1019 

Qy 1014 SEYNSSNCSLSSSE NPYATIKDPPVLI PKSSECGYVEMK SPA 1055 

I :| | : I I : I I : | I : : I : I III 

Db 1020 ASLDEVAGEGSSSSASASASRRGLNSSEQSRRP — LLEEHDEEEFDEPHENSI SPAHAVT 1077 

Qy 1056 RRDSPYAEINN STSANR NVY 1075 

: : I I I : I : : III: I : I 

Db 1078 TSNHNENPYADISSPDPVTQNSANKKRAQDNLY 1110 



RESULT 4 
A35844 

Xotch protein - African clawed frog 

C;Species: Xenopus laevis (African clawed frog)
C;Date: 12-Oct-1990 #sequence_revision 12-Oct-1990 #text_change 02-Aug-2002
C;Accession: A35844
R;Coffman, C.; Harris, W.; Kintner, C.
Science 249, 1438-1441, 1990
A;Title: Xotch, the Xenopus homolog of Drosophila notch.
A;Reference number: A35844; MUID:90385285; PMID:2402639
A;Accession: A35844
A;Status: preliminary; nucleic acid sequence not shown; not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-2524 <COF>
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
C;Keywords: transmembrane protein
F;146-177/Domain: EGF homology <EGX1>
F;184-215/Domain: EGF homology <EGF1>
F;222-254/Domain: EGF homology <EGF>
F;456-487/Domain: EGF homology <EGX2>
F;757-788/Domain: EGF homology <EGF3>
F;1025-1056/Domain: EGF homology <EGX3>
F;1924-1956/Domain: ankyrin repeat homology <AN1>
F;1957-1989/Domain: ankyrin repeat homology <AN2>
F;1991-2023/Domain: ankyrin repeat homology <AN3>
F;2024-2056/Domain: ankyrin repeat homology <AN4>
F;2057-2089/Domain: ankyrin repeat homology <AN5>



Query Match 15.4%; Score 1036; DB 2; Length 2524; 

Best Local Similarity 25.4%; Pred. No. 2.6e-47; 



Matches 326; Conservative 83; Mismatches 305; Indels 568; Gaps 



Qy 83 GEKTMYRR KSQC CP-GFYESGEMCVPHCADKCVHGR 117

||:: I : I I i i I I : : : I : : I I : 

Db 54 GERCQFPNPCTIKNQCMNFGTCEPVLQGNAIDFICHCPVGF--TDKVCLTPVDNACWINP 111 

Qy 118 C IAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNP--IT 164 

I : I : I I I I I : I I I 1 I I I I I I 

Db 112 CRNGGTCELLNSVTEYKCRCPPGWTGDSCQQA DPCASN-PCANGGKCLPFEIQ 163 

Qy 165 GACHCAAGFRGWRCE DRCEQ GTYGNDCHQR CQ 196 

I I II II: : I I I : I I I i 

Db 164 YICKCPPGFHGATCKQDINECSQNPCKNGGQCINEFGSYRCTCQNRFTGRNCDEPYVPCN 223 

Qy 197 CQNGATC DHVTGECRCPPGYTGAFCED LC 225

I I I I I I : : I I I I : : I I I : I 

Db 224 PSPCLNGGTCRQTDDTSYDCTCLPGFSGQNCEENIDDCPSNNCRNGGTCVDGVNTYNCQC 283 

Qy 226 PPGKHGPQCEQ RC PCQNGGVCHHVTG--ECSCPSGWMGTVCGQ 266

II II: I I I I I I M : I I I : I I I I : 

Db 284 PPDWTGQYCTEDVDECQLMPNACQNGGTCHNTYGGYNCVCVNGWTGEDCSENIDDCANAA 343 

Qy 267 PCPEGRFGKNC--SQEC QCHNGGTCDA— ATGQ--CHCSPG 301 

I I i I I I I I : I I I I : I I i I 

Db 344 CHSGATCHDRVAS FYCECPHGRTGLLCHLDNACI SNPCNEGSNCDTNPVNGKAICTCPPG 403 

Qy 302 YTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA--CLCEAGFAGERCEARLCP 356

| | I I I I I : i I II : I : I : I I I : I I I I I : 

Db 404 YTGPACNNDVDECSLGAN PCERGGRCTNTLGSFQCNCPQGYAGPRCEIDV -- 453

Qy 357 EGLYGI KCDKRC PCHLENTHSCHPMSGE— CACKPGWSGLYC 396 

I II : I : I II 1111:1111 

Db 454 NECLSNPC--QNDSTCLDQIGEFQCICMPGYEGLYCETNIDECASNPCLHN 502 

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC DSVTGK- 427 

M I III I ! I : I : I I I I : I I : 

Db 503 GKCIDKINEFRCDCPTGFSGNLCQHDFDECTSTPCKNGAKCLDGPNSYTCQCTEGFTGRH 562 

Qy 428 CTCAPGFKGIDC STP 442 

I I II : I I II 
Db 563 CEQDINECIPDPCHYGTCKDGIAT FTCLCRPGYTGRLCDNDINECLSKPCLNGGQCTDRE 622 

Qy 443 CPLGTYGINCSSR CG CKNDAVCSPVDG-SCTCKAGWHGVDCSIR 485

: I I : : I II : I I I II : I : I I : I 

Db 623 NGYICTCPKGTTGVNCETKIDDCASNLCDNGKCIDKIDGYECTCEPGYTGKLCNININEC 682 

Qy 486 CPSGTWGFGC 495 

111:1 

Db 683 DSNPCRNGGTCKDQINGFTCVCPDGYHDHMCLSEVNECNSNPCIHGACHDGVNGYKCDCE 742 

Qy 496 NLTCQ CLNGGACNTLDGT — CTCAPGWRGEKCEL PC-Q 530 

II: I : I I I I : I Nihil: II 
Db 743 AGWSGSNCDINNNECESNPCMNGGTCKDMTGAYICTCKAGFSGPNCQTNINECSSNPCLN 802 

Qy 531 DGT YGLNC AERCD CSHADGCHPT TGHCRCLPGWS 564 

II III I : | : | : I I I I I I 

Db 803 HGTCIDDVAGYKCNCMLPYTGAICEAVLAPCAGSPCKNGGRCKESEDFETFSCECPPGWQ 862



Qy 565 GVHCD SVCAEGRWGPNCSL PCYCKNGASC 593 

||: I I I I I : I I I I I I 

Db 863 GQTCEIDMNECVNRPCRNGATCQNTNGSYKCNCKPGYTGRNCEMDIDDCQPNPCHNGGSC 922

Qy 594 SPDDGI CECAPGFRGTTCQR ICSPGFYGHRC 624 

I I II I I I I I I I : I I I I I I 

Db 923 S--DGINMFFCNCPAGFRGPKCEEDINECASNPCKNGANCTDCVNSYTCTCQPGFSGIHC 980 

Qy 625 SQTCPQCVHSS GPCHHITGL CDCLPGFTGALC NE 658 

I I II lllh 1111111:1 II 

Db 981 ESNTPDCTESSCFNGGTC--IDGINTFTCQCPPGFTGSYCQHDINECDSKPCLNGGTCQD 1038 



Qy 659 VCPSGRFGKNCAGI CTCTNNGTC NPIDRSCQCYPGWIGSDCSQP 702 

I I I I I I : I I I I I I I : I I I I I I 

Db 1039 SYGTYKCTCPQGYTGLNCQNLVRWCDSSPCKNGGKCWQTNNFYR-CECKSGWTGVYCDVP 1097 

Qy 703 CPPA -- HWGPNCIHTCNCHNGAFC -- SAYDGECKCTPGWTGLYCTQRCPLGFYGKDC 755

II I : : I II I : I : I I : I I I I : : : I 

Db 1098 SVSCEVAAKQQGVDIVHL — CRNSGMCVDTGNTHFCRCQAGYTGS YCEEQV DEC 1149 



Qy 756 ALICQCQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

: I I I I I I I :: I II I : I : I : : 

Db 1150 S-PNPCQNGATCTDYLGGYSCECVAGYHGVNCSEEINECLSHPCQNGGTCIDLINTYKCS 1208 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-TCYCSPGWKGARCDQAG 832

I I I I I I I I I I I I : I I I I I : I I I : 

Db 1209 CPRGTQGVHCEINVDDCTPFYDSFTLEPKCFNNGKCIDRVGGYNCICPPGFVGERCE 1265



Qy 833 VIIVGNLNS-LSRTSTALPADS 853

I : : I I I III 

Db 1266 GDVNECLSN PCDS 1278 



RESULT 5 
A40043 

notch protein homolog TAN-1 precursor - human 
C;Species: Homo sapiens (man)
C;Date: 21-Apr-1992 #sequence_revision 21-Apr-1992 #text_change 02-Aug-2002
C;Accession: A40043
R;Ellisen, L.W.; Bird, J.; West, D.C.; Soreng, A.L.; Reynolds, T.C.; Smith, S.D.; Sklar, J.
Cell 66, 649-661, 1991
A;Title: TAN-1, the human homolog of the Drosophila Notch gene, is broken by
chromosomal translocations in T lymphoblastic neoplasms.
A;Reference number: A40043; MUID:91347367; PMID:1831692
A;Accession: A40043
A;Status: preliminary; nucleic acid sequence not shown; not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-2555 <ELL>
A;Cross-references: GB:M73980
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
F;261-292/Domain: EGF homology <EGX1>
F;494-525/Domain: EGF homology <EGF1>
F;987-1018/Domain: EGF homology <EGX2>
F;1149-1180/Domain: EGF homology <EGF>
F;1187-1218/Domain: EGF homology <EGF3>
F;1233-1264/Domain: EGF homology <EGX3>
F;1927-1959/Domain: ankyrin repeat homology <AN1>
F;1960-1992/Domain: ankyrin repeat homology <AN2>
F;1994-2026/Domain: ankyrin repeat homology <AN3>
F;2027-2059/Domain: ankyrin repeat homology <AN4>
F;2060-2092/Domain: ankyrin repeat homology <AN5>



Query Match 15.3%; Score 1032; DB 2; Length 2555;

Best Local Similarity 25.7%; Pred. No. 4.2e-47;

Matches 315; Conservative 84; Mismatches 304; Indels 522; Gaps 73;



Qy 94 CCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCSSACDGDH 143

I I I I I : I : : I : I : I : I I I I I : I I

Db 89 CALGF--SGPLCLTPLDNACLTNPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQA 141

Qy 144 WGPHCTSRCQCKNGALCNPITGA-- CHCAAGFRGWRCE DRCEQG TYGNDCHQ- 193

II III II : I I I III : I I : I I I

Db 142 -- DPCASN-PCANGGQCLPFEASYICHCPPSFHGPTCRQDVNECGQKPRLCRHGGTCHNE 198

Qy 194 -- -RC QCQNGATC DHVTGECRCPPGYTGAFCE- 222

II I I I I I I I I I I I I I : I 1 I I

Db 199 VGSYRCVCRATHTGPNCERPYVPCSPSPCQNGGTCRPTGDVTHECACLPGFTGQNCEENI 258

Qy 223 DLCPPG -- KHGPQC EQRCP CQNGGVCHHVTG- 251

III I : I I II I I I II I : I

Db 259 DDCPGNNCKNGGACVDGVNTYNCPCPPEWTGQYCTEDVDECQLMPNACQNGGTCKNTHGG 318

Qy 252 -ECSCPSGWMGTVCGQ PCPEGRFGKNC--SQEC -- 281

I I : I I I I : I I I I I I : I

Db 319 YNCVCVNGWTGEDCSENIDDCASAACFHGATCHDRVASFYCECPHGRTGLLCHLNDACIS 378

Qy 282 -QCHNGGTCDA--ATGQ -- CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY 333

1:111 I : I I I I I I I I I I : I I : I I I

Db 379 NPCNEGSNCDTNPVNGKAICTCPSGYTGPACSQDVDECSLGAN PCEHAGKCI 430

Qy 334 HVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRC PCHLENTHSCHPMSGE-- CA 386

: I : I I I : I I I I : I I I : I : I III

Db 431 NTLGSFECQCLQGYTGPRCEIDV NECVSNPC -- QNDATCLDQIGEFQCM 477

Qy 387 CKPGWSGLYC NE TCSPGFYGEACQQI--CS CQ 416

I I I : I : : I II I I I I I I : I : I :

Db 478 CMPGYEGVHCEVNTDECASSPCLHNGRCLDKINEFQCECPTGFTGHLCQDVDECASTPCK 537

Qy 417 NGADC DSV-TGKCTCAPGFKGI 437

I I I I I I I I I I I : I

Db 538 NGAKCLDGPNTYTCVCTEGYTGTHCEVDIDECDPDPCHYGSCKDGVATFTCLCRPGYTGH 597

Qy 438 DCST PCPL GTYGINCS SRCGCKNDAVCSP 466

II III I I I I I : I :

Db 598 HCETNINECSSQPCRLRGTCQDPDNAYLCFCLKGTTGPNCEINLDDCASSPCDSGTCLDK 657

Qy 467 VDG-SCTCKAGWHGVDCSIR CPSGTWGFGCNL TC 499

: I I I I : I : I I : I I I I I II

Db 658 IDGYECACEPGYTGSMCNSNIDECAGNPCHNGGTCEDGINGFTCRCPEGYHDPTCLSEVN 717

Qy 500 QCLNGGACNTLDG-TCTCAPGWRGEKCEL P 528



I : : I : : I : i I I I I I I I : : 

Db 718 ECNSNPCVHGACRDSLNGYKCDCDPGWSGTNCDINNNECESNPCVNGGTCKDMTSGIVCT 777 

Qy 529 CQDGTYGLNCAERCD CSHADGC-HPTTGH-CRCLPGWSGVHCDSV CAEG — 575 

I : : I I I I : I : I I : I I I : : I I : I II 

Db 778 CREGFSGPNCQTNINECASNPCLNKGTCIDDVAGYKCNCLLPYTGATCEWLAPCAPSPC 837 

Qy 576 RWGPNC SLPCYC KNGASCSPDDGICECAPGFRGTTCQRI CSP 617 

III III I : I I I : I I : I I I 

Db 838 RNGGECRQSEDYESFSCVCPTAGAKGQTCEVDINECVLSPCRHGASCQNTHGXYRCHCQA 897

Qy 618 GFYGHRCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I : I I I III: I I I I I I I I I I I 

Db 898 GYSGRNCETDIDDC — RPNPCHNGGSCTDGINTAFCDCLPGFRGTFCEEDINECASDPCR 955 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPGW 694 

I I : I I : I 1111111:1 : I I I I : 

Db 956 NGANCTDCVDSYTCTCPAGFSGIHCENNTPDCTESSCFNGGTC-- VDGINSFTCLCPPGF 1013 

Qy 695 IGSDC SQP CPPAHWGPNC IHTCN CHNGAF 723 

III I : I I I : I I I I : I I : I I I 

Db 1014 TGSYCQHWNECDSRPCLLGGTCQDGRGLHRCTCPQGYTGPNCQNLVHWCDSSPCKNGGK 1073 

Qy 724 C SAYDGECKCTPGWTGLYCTQ 744 

I : I I : I I I I I I I I 

Db 1074 CWQTHTQY--RCECPSGWTGLYCDVPSVSCEVAAQRQGVDVARLCQHGGLCVDAGNTHHC 1131 

Qy 745 RCPLGFYGKDCALI CQ CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

II I : I I : I I I I I I I I : : I II | : | : | : : 

Db 1132 RCQAGYTGSYCEDLVDECSPSPCQNGATCTDYLGGYSCKCVAGYHGVNCSEEIDECLSHP 1191 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG- 815 

II I I I I I I I I I I I : I 

Db 1192 CQNGGTCLDLPNTYKCSCPRGTQGVHCEINVDDCNPPVDPVSRSPKCFNNGTCVDQVGGY 1251 

Qy 816 TCYCSPGWKGARCDQAGVIIVGNLN 840

: I I I I : I I I : I : : I 

Db 1252 SCTCPPGFVGERCE GDVN 1269 



RESULT 6 
A46019 

notch-1 protein - mouse 

N;Alternate names: motch protein
C;Species: Mus musculus (house mouse)
C;Date: 22-Sep-1993 #sequence_revision 18-Nov-1994 #text_change 07-Mar-2003
C;Accession: A46019; S25144; C49175; B46438; A46438; PH1569; S32109
R;del Amo, F.F.; Gendron-Maguire, M.; Swiatek, P.J.; Jenkins, N.A.; Copeland, N.G.; Gridley, T.
Genomics 15, 259-264, 1993
A;Title: Cloning, analysis, and chromosomal localization of Notch-1, a mouse
homolog of Drosophila Notch.
A;Reference number: A46019; MUID:93194170; PMID:8449489
A;Accession: A46019
A;Status: not compared with conceptual translation
A;Molecule type: nucleic acid
A;Residues: 1-2531 <DEL>
A;Cross-references: GB:Z11886; GB:S47228; NID:g288502; PIDN:CAA77941.1;
PID:g288503
A;Note: sequence extracted from NCBI backbone (NCBIP:127318)
R;Franco del Amo, F.; Smith, D.E.; Swiatek, P.J.; Gendron-Maguire, M.;
Greenspan, R.J.; McMahon, A.P.; Gridley, T.
submitted to the EMBL Data Library, April 1992
A;Description: Expression pattern of Motch, a mouse homolog of Drosophila Notch,
suggests an important role in early postimplantation mouse development.
A;Reference number: S25144
A;Accession: S25144
A;Molecule type: mRNA
A;Residues: 1551-2108,'Q',2110-2114,'ALP',2118-2170 <FRA>
A;Cross-references: EMBL:Z11886
R;Lardelli, M.; Lendahl, U.
Exp. Cell Res. 204, 364-372, 1993
A;Title: Motch A and Motch B--two mouse Notch homologues coexpressed in a wide
variety of tissues.
A;Reference number: A49175; MUID:93178563; PMID:8440332
A;Accession: C49175
A;Status: preliminary; nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 1161-1547 <LAR>
A;Cross-references: EMBL:X68278; NID:g287987; PIDN:CAA48339.1; PID:g287988
A;Experimental source: embryo
A;Note: sequence extracted from NCBI backbone (NCBIP:126159)
R;Kopan, R.; Weintraub, H.
J. Cell Biol. 121, 631-641, 1993
A;Title: Mouse notch: expression in hair follicles correlates with cell fate
determination.
A;Reference number: A46438; MUID:93252998; PMID:8486742
A;Accession: B46438
A;Status: preliminary
A;Molecule type: nucleic acid
A;Residues: 1865-1932,'RR',1935-1937,'L',1938-1967,'I',1969-2044,'IE',2047-
2052,'S',2054-2056,'SIRRE',2062-2075 <KOP>
A;Experimental source: embryo
A;Note: sequence extracted from NCBI backbone (NCBIN:131246, NCBIP:131247)
C;Comment: This protein has many EGF repeats and lin-12 [ #1172 ] /Notch repeats.
C;Comment: This protein is one of the neurogenic proteins controlling the
decision between ectodermal and neural fate for cells in the early embryo.
C;Genetics:
A;Gene: notch-1
A;Map position: 2
A;Note: proximal region of chromosome 2
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology

F;106-138/Domain: EGF homology <EGF1>
F;144-175/Domain: EGF homology <EG01>
F;222-254/Domain: EGF homology <EGF2>
F;261-292/Domain: EGF homology <EG02>
F;339-370/Domain: EGF homology <EG03>
F;416-449/Domain: EGF homology <EGF3>
F;456-487/Domain: EGF homology <EG04>
F;494-525/Domain: EGF homology <EG05>
F;532-563/Domain: EGF homology <EG06>
F;607-638/Domain: EGF homology <EG07>
F;682-713/Domain: EGF homology <EG08>
F;757-788/Domain: EGF homology <EG09>
F;795-826/Domain: EGF homology <EG10>
F;873-904/Domain: EGF homology <EG11>
F;911-942/Domain: EGF homology <EG12>
F;949-980/Domain: EGF homology <EG13>
F;987-1018/Domain: EGF homology <EG14>
F;1025-1056/Domain: EGF homology <EG15>
F;1063-1094/Domain: EGF homology <EG16>
F;1149-1180/Domain: EGF homology <EG17>
F;1187-1218/Domain: EGF homology <EG18>
F;1233-1264/Domain: EGF homology <EGF4>
F;1352-1383/Domain: EGF homology <EG19>
F;1391-1425/Domain: EGF homology <EGF>
F;1917-1948/Domain: ankyrin repeat homology <AN1>
F;1949-1981/Domain: ankyrin repeat homology <AN2>
F;1983-2015/Domain: ankyrin repeat homology <AN3>
F;2016-2048/Domain: ankyrin repeat homology <AN4>
F;2049-2081/Domain: ankyrin repeat homology <AN5>



Query Match 15.2%; Score 1028; DB 2; Length 2531; 

Best Local Similarity 25.7%; Pred. No. 6.8e-47; 

Matches 314; Conservative 83; Mismatches 286; Indels 538; Gaps 73; 

Qy 86 TMYRRKSQCCPGFYESGEMCV PHCADKCVH-GRCIAPNTCQCEPGWGGTNCSS- 137

I : I : I I I : I I : I I : : I : I : I : : I : I I I : I I 

Db 121 TLTEYKCRCSPGW--SGKSCQQADPCASNPCANGGQCLPFESSYICRCPPGFHGPTCRQD 178

Qy 138 ACDGDHWGPHC TSRCQCKNGALCNP 162 

II I I I I I I I : I II I I 

Db 179 VNECSQNPGLCRHGGHCHNEI GS YRCACCATHTGPHCELP YVPCS PS PCQNGATCRPTGD 238 

Qy 163 ITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATC-DHV-TGECRCPPGYTGAF 220

I I I I I I I I : : II I : I I I I I I I II I I I I : 

Db 239 TTHECACLPGFAGQNCEENVD DCPGN-NCKNGGACVDGVNTYNCRCPPEVTGQY 291 

Qy 221 C-EDLCPPGKHGPQCEQRCP — CQNGGVCHHVTG — ECSCPSGWMGTVCGQ 266 

III: : I I I I I I I I I : I I I Ml I I : 

Db 292 CTEDV DEC-QLMPNACQNAGTCHNTHGGYNCVCVNGWTGEDCSENIDDCASAA 343 

Qy 267 PCPEGRFGKNC — SQEC QCHNGGTCDA--ATGQ — CHCSPG 301 

I I I I I I I I : I I I I : I I I 

Db 344 CFQGATCHDRVASFYCECPHGRTGLLCHLKHACISNPCNEGSNCDTNPVNGKRICTCPSG 403

Qy 302 YTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA--CLCEAGFAGERCEARLCP 356

I II I 1 1 I : I : I : I I I : I : I I I : I I I : 

Db 404 YTGPACSQDVDECDLGAN RCEHAGKCLNTLGSFECQCLQGYTGPGCEIDV-- 453

Qy 357 EGLYGIKCDKRC PCHLENTHSCHPMSGE — CACKPGWSGLYCNET 399 

I | | : | : | 111111:1:11 
Db 454 NECISNPC — QNDATCLDQIGEFQCICMPGYEGVYCEINTDECASSPCLHN 502 

Qy 400 CSPGFYGEACQ QICS CQNGADC--DSVTGKCTCAPGFKGID 438 

I I I I I I I : I : I I I I I I I I : I 

Db 503 GHCMDKIHEFQCQCPKGFNGHLCQYDVDECASTPCKNGAKCLDGPNTYTCVCTEGYTGTH 562

Qy 439 CST PCPLGT YGINCSSRCG CKNDAVCSPVD 468

I III: I : I : I : : I I 

Db 563 CEVDIDECDPDPCHYGSCKDGVAT FTCLCQPGYTGHHCETNINECHSQPCRHGGTCQDRD 622 



Qy 469 GS — CTCKAGWHGVDCSIR CPSGTWGFGCNLTCQCLNGGACNTLDG-TCTCA 517 

11111:11 I I I I II : : | | | | 

Db 623 NSYLCLCLKGTTGPNCEINLDDCASNPCDSGT CL DKIDGYECACE 667 

Qy 518 PGWRGEKCEL PCQDGTYGLNCAERCDCSHADGCHPTT 554 

I I : I I : I : I I I I I I : I I I 

Db 668 PGYTGSMCNVNIDECAGSPCHNGGTCEDGIAGFTC — RC PEGYHDPTCLSEVNECN 721 

Qy 555 GHCR CLPGWSGVHCD SVCAE 574 

III 111111:11 II 

Db 722 SNPCIHGACRDGLNGYKCDCAPGWSGTNCDINNNECESNPCVKGGTCKDMTSGYVCTCRE 781

Qy 575 GRWGPNC SLPCY CKNG 590 

I I I I I III I I I 

Db 782 GFSGPNCQTNINECASNPCLNQGTCIDDVAGYKCNCPLPYTGATCEWLAPCATSPCKNS 841

Qy 591 ASCSPDDGI CECAPGFRGTTCQ RICSPGFYG 621 

I : | | | : : | | | : : I I : I 

Db 842 GVCKESEDYESFSCVCPTGWQGQTCEVDINECVKSPCRHGASCQNTNGSYRCLCQAGYTG 901 

Qy 622 HRCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I I III: I I I I I II I I I I I 

Db 902 RNCESDIDDC — RPNPCHNGGSCTDGINTAFCDCLPGFQGAFCEEDINECASNPCQNGAN 959 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPGWIGSD 698 

II I I : I II Mill : I : I I I I : I I 

Db 960 CTDCVDSYTCTCPVGFNGIHCENNTPDCTESSCFNGGTC— VDGINSFTCLCPPGFTGSY 1017 

Qy 699 C SQP CPPAHWGPNC IHTCN CHNGAFC 724 

I I : I I I : I I I : I : I I I I 

Db 1018 CQYDVNECDSRPCLHGGTCQDSYGTYKCTCPQGYTGLNCQNLVRWCDSAPCKNGGRCWQT 1077 

Qy 725 -SAYDGECKCTPGWTGLYC TQRCPLGFY--GKDCALICQ 760

: I hi II I I : I : I : II I : i I 

Db 1078 NTQY — HCECRSGWTGVNCDVLSVSCEVAAQKRGIDVTLLCQHGGLCVDEGDKHYCHCQA 1135 

Qy 761 CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

I I I I I I I :: I II | : | : | : : 
Db 1136 GYTGSYCEDEVDECSPNPCQNGATCTDYLGGFSCKCVAGYHGSNCSEEINECLSQPCQNG 1195 

Qy 789 CPSGTYGYGCRQI CD CLNNSTC-DHITG-TCYC 819 

I I I I I I I 111111:1111 

Db 1196 GTCIDLTNSYKCSCPRGTQGVHCEINVDDCHPPLDPASRSPKCFNNGTCVDQVGGYTCTC 1255 

Qy 820 SPGWKGARCDQAGVIIVGNLN 840

I I : I I I : I : : I 

Db 1256 PPGFVGERCE GDVN 1269 



RESULT 7 
S18188 

notch protein homolog - rat 

C;Species: Rattus norvegicus (Norway rat)
C;Date: 19-Feb-1994 #sequence_revision 10-Nov-1995 #text_change 02-Aug-2002
C;Accession: S18188
R;Weinmaster, G.; Roberts, V.J.; Lemke, G.
Development 113, 199-205, 1991
A;Title: A homolog of Drosophila Notch expressed during mammalian development.
A;Reference number: S18188; MUID:92111383; PMID:1764995
A;Accession: S18188
A;Molecule type: mRNA
A;Residues: 1-2531 <WEI>
A;Cross-references: EMBL:X57405; NID:g57634; PID:g57635
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
F;987-1018/Domain: EGF homology <EGF1>
F;1025-1056/Domain: EGF homology <EGF>
F;1233-1264/Domain: EGF homology <EGF2>
F;1917-1949/Domain: ankyrin repeat homology <AN1>
F;1950-1982/Domain: ankyrin repeat homology <AN2>
F;1984-2016/Domain: ankyrin repeat homology <AN3>
F;2017-2049/Domain: ankyrin repeat homology <AN4>
F;2050-2082/Domain: ankyrin repeat homology <AN5>

Query Match 15.2%; Score 1024; DB 2; Length 2531;

Best Local Similarity 25.8%; Pred. No. 1.1e-46;

Matches 315; Conservative 83; Mismatches 286; Indels 536; Gaps 74;

Qy 86 TMYRRKSQCCPGFYESGEMCV PHCADKCVH-GRCI APNTCQCEPGWGGTNCSS- 137 

I : I : I I I : I I : I I : : I : I : I : : MM: I 

Db 121 TLTEYKCRCPPGW--SGKSCQQADPCASNPCANGGQCLPFESSYICGCPPGFHGPTCRQD 178

Qy 138 ACDGDHWGPHC TSRCQCKNGALCNP 162 

II I I I I I I I : I I I I 

Db 179 VNECSQNPGLCRHGGTCHNEIGSYRCACRATHTGPHCELPYVPCSPSPCQNGGTCRPTGD 238

Qy 163 ITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATC-DHV-TGECRCPPGYTGAF 220

I I I I I I I I : : II I : I I I I I I I I I II : I I : 

Db 239 TTHECACLPGFAGQNCEENVD DCPGN-NCKNGGACVDGVNTYNCRCPPEWTGQY 291 

Qy 221 C-EDLCPPGKHGPQCEQRCP — CQNGGVCHHVTG--ECSCPSGWMGTVCGQ 266 

I I I : : I I I I I I I I I : I I I : I I I I 

Db 292 CTEDV DEC-QLMPNACQNAGTCHNSHGGYNCVCVNGWTGEDCSDNIDDCASAA 343

Qy 267 PCPEGRFGKNC — SQEC QCHNGGTCDA — ATGQ — CHCSPG 301 

I I I I I I : I hill I : ! I I 

Db 344 CFQGATCHDRVASFYCECPHGRTGLLCHLNDACISNPCNEGSNCDTNPVNGKAICTCPRG 403 

Qy 302 YTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA--CLCEAGFAGERCEARLCP 356 

III I I I I : I I : I I I : I : II I : I I I I : 

Db 404 YTGPACSQDVDECALGAN PCEHAGKCLNTLGSFECQCLQGYTGPRCEIDV -- 453

Qy 357 EGLYGIKCDKRC PCHLENTHSCHPMSGE--CACKPGWSGLYC 396 

I I ! : I : I II I I I I : I : I I 

Db 454 NECISNPC--QNDATCLDQIGEFQCICMPGYEGVYCEINTDECASSPCLHN 502 

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC--DSVTGKCTCAPGFKGID 438 

II I I I I I I I : I : I I I I I I I I : I 

Db 503 GRCVDKINEFLCQCPKGFSGHLCQYDVDECASTPCKNGAKCLDGPNTYTCVCTEGYTGTH 562 

Qy 439 CST PCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGW 477 

I I I : I MM: M I : I : 

Db 563 CEVDIDECDPDPCHIGL CK-DGVAT FTCLCQPGYTGHHCETNINECH 608 



Qy 478 HGVDCSIR CPSGTWGFGCNLTCQ CLNGGACNTLDG-TCTCAP 518 

I I I I I I I I I : I : I : : M | | | 

Db 609 SQPCRHGGTCQDRDNYYLCLCLKGTTGPNCEINLDDCASNPCDSGTCLDKIDGYECACEP 668 

Qy 519 GWRGEKCEL PCQDGTYGLNCAERCDCSHADGCHPTT 554

I : I I : I : II I I I I : I I I 

Db 669 GYTGSMCNVNIDECAGSPCHNGGTCEDGIAGFTC-- RC PEGYHDPTCLSEVNECNS 722

Qy 555 GHCR CLPGWSGVHCD SVCAEG 575 

III I I I I I I : I I ill 

Db 723 NPCIHGACRDGLNGYKCDCAPGWSGTNCDINNNECESNPCVNGGTCKDMTSGYVCTCREG 782 

Qy 576 RWGPNC SLPCY CKNGA 591 

I I I I III | | | 

Db 783 FSGPNCQTNINECASNPCLNQGTCIDDVAGYKCNCPLPYTGATCEWLAPCATSPCKNSG 842

Qy 592 SCSPDDGI CECAPGFRGTTCQ RICS PGFYGH 622 

I : I I I : : I II : : I I : I 

Db 843 VCKESEDYESFSCVCPTGWQGQTCEIDINECVKSPCRHGASCQNTNGSYRCLCQAGYTGR 902 

Qy 623 RCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I I Ml: I I I I II I II I I 

Db 903 NCESDIDDC— RPNPCHNGGSCTDGVNAAFCDCLPGFQGAFCEEDINECATNPCQNGANC 960 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPGWIGSDC 699

i I : I I : I II I I I M : I : I I I I : I I I 

Db 961 TDCVDSYTCTCPTGFNGIHCENNTPDCTESSCFNGGTC — VDGINSFTCLCPPGFTGSYC 1018 

Qy 700 SQP CPPAHWGPNC IHTCN CHNGAFC 724 

I : I I I : I II : I : I II I 

Db 1019 QYDVNECDSRPCLHGGTCQDSYGTYKCTCPQGYTGLNCQNLVRWCDSAPCKNGGKCWQTN 1078 

Qy 725 SAYDGECKCTPGWTGLYC TQRCPLGFY— GKDCALICQ 760 

: I 1:1 I I I I I : I : I I I : M 

Db 1079 TQY--HCECRSGWTGFNCDVLSVSCEVAAQKRGIDVTLLCQHGGLCVDEEDKHYCHCQAG 1136

Qy 761 CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

I II I I I I :: I II | : | : | : : 
Db 1137 YTGSYCEDEVDECSPNPCQNGATCTDYLGGFSCKCVAGYHGSNCSEEINECLSQPCQNGG 1196 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-TCYCS 820 

I : I I I I 

Db 1197 TCIDLTNTYKCSCPRGTQGVHCEINVDDCHPPLDPASRSPKCFNNGTCVDQVGGYTCTCP 1256 

Qy 821 PGWKGARCDQAGVIIVGNLN 840

11:111: I : : I 

Db 1257 PGFVGERCE GDVN 1269 



RESULT 8 

A49128 

cell-fate determining gene Notch2 protein - rat 
C;Species: Rattus norvegicus (Norway rat)
C;Date: 21-Jan-1994 #sequence_revision 18-Nov-1994 #text_change 02-Aug-2002
C;Accession: A49128
R;Weinmaster, G.; Roberts, V.J.; Lemke, G.
Development 116, 931-941, 1992
A;Title: Notch2: a second mammalian Notch gene.
A;Reference number: A49128; MUID:93202015; PMID:1295745
A;Accession: A49128
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-2471 <WEI>
A;Experimental source: Schwann cell
A;Note: sequence extracted from NCBI backbone (NCBIP:127811)
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
F;264-295/Domain: EGF homology <EGX1>
F;799-830/Domain: EGF homology <EGF1>
F;877-908/Domain: EGF homology <EGX2>
F;1029-1060/Domain: EGF homology <EGF>
F;1067-1098/Domain: EGF homology <EGX3>
F;1153-1184/Domain: EGF homology <EGF3>
F;1191-1222/Domain: EGF homology <EGX4>
F;1876-1908/Domain: ankyrin repeat homology <AN1>
F;1909-1941/Domain: ankyrin repeat homology <AN2>
F;1943-1975/Domain: ankyrin repeat homology <AN3>
F;1976-2008/Domain: ankyrin repeat homology <AN4>
F;2009-2041/Domain: ankyrin repeat homology <AN5>

Query Match 14.8%; Score 998; DB 2; Length 2471; 

Best Local Similarity 24.4%; Pred. No. 2.6e-45; 

Matches 321; Conservative 79; Mismatches 322; Indels 596; Gaps 70; 

Qy 17 CHWIGTASPLNLEDPNVCSHWES YSVTVQESY PHPFDQI YYTSCTDIL 64

II: | : : | | | : | | : : : | : :

Db 121 CHMLS WDTYECTCQVGFTGKQCQWTDVCLSHPCEN -- GSTCSSVA 163

Qy 65

: I I : I : I : : I I I II

Db 164 -RCPAGI--TGQKCDADINECDIPGRCQHGGTC 198

Qy 125 QCEPGWGGTNCSSACDGDHWGPHCTS RCQCKNGALC NPITGACHCAAGFRG 175

II I I I I I I I I I I I I I I I I I

Db 199

Qy 176

II : I I I I I I I I I I I I I : I I I I I I :

Db 255 DCPNH-KCQNGGVCVDGVNTYNCRCPPQWTGQFCTEDV D 300

Qy 233 267

I I MINI: I I I : I

Db 301

Qy 268

1111:11 E I I I I I I I I I III II

Db 361

Qy 313 VGTYGVLCAETCQCVNGGKCYHVSGA -- CLCEAGFAGERCE 351

I : I : | I I : M I I I : II I I I

Db 421 ANSNPCEHAGKCVNTDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLD 473

Qy 352 -ARLCPEGLYGIKCDKR CP CHLE- 373

I I I I : I : II I : :

Db 474

Qy 374 NTHSC--HPMSGECACKPGWSGLYCNET 399 

I I II III I : : I I : I 
Db 534 DDCSSTPCLNGAKCIDHPNGYECQCATGFTGTLCDENIDNCDPDPCHHGQCQDGIDSYTC 593 

Qy 400 -CSPGFYGEAC-QQICSC QNGADCDSVTG-KCTCAPGFKGIDC STP 442 

hlhlllll : I I I I : I I I I I : : I | I 

Db 594 ICNPGYMGAICSDQIDECYSSPCLNDGRCIDLVNGYQCNCQPGTSGLNCEINFDDCASNP 653 

Qy 443 CPLGTY--GIN CS SRCG CKNDAVC 464 

II I I I II II I : I I I 

Db 654 CLHGACVDGINRYSCVCSPGFTGQRCNIDIDECASNPCRKDATCINDVNGFRCMCPEGPH 713 

Qy 465 SP-VDGSCT CKAGWHGVDCSI 484 

II : I : I I I I 1 I I : : I : 

Db 714 HPSCYSQVNECLSSPCIHGNCTGGLSGYKCLCDAGWVGINCEVDKNECLSNPCQNGGTCN 773 

Qy 485 RCPSGTWGFGCNLTCQ CLNGGAC 507

I I I : I : I I I I I 

Db 774 NLVNGYRCTCKKGFKGYNCQVNIDECASNPCLNQGTCLDDVSGYTCHCMLPYTGKNCQTV 833

Qy 508 NTLDGTCT CAP GWRGEKCELPCQDGTYGLNCAERCDC SHAD 548

I I I I I | | | : | : : | : : | : | : 

Db 834 LAPCSPNPCENAAVCKEAPNFESFTCLCAPGWQGQRCTVDVDE CVSK-PCMNNG 886 

Qy 549 GCHPTTGH -- CRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASC -- SPDDGICECAP 604

I I I I | | | | : | | : | : | | I : I I I I : 

Db 887 ICHNTQGSYMCECPPGFSGMDCEE DINDCLANPCQNGGSCVDKVNTFSCLCLP 939 

Qy 605 GFRGTTCQR ICSPGFYGHRCSQTCPQCVHSS 635 

I I I I I I I I : I I : I I I 

Db 940 GFVGDKCQTDMNECLSEPCKNGGTCSDYVNSYTCTCPAGFHGVHCENNIDECTESSCFNG 999 

Qy 636 GPCHHITGL CDCLPGFTGALC NE VCPSGRFG 666 

I I : I : I I I I I I I II III I 

Db 1000 GTC--VDGINSFSCLCPVGFTGPFCLHDINECSSNPCLNSGTCVDGLGTYRCTCPLGYTG 1057 

Qy 667 KNC AGICT CTNNGTC--NPIDRSCQCYPGWIGSDCSQ 701 

III : I : I I I I I I I I I I I : I 

Db 1058 KNCQTLVNLCSPSPCKNKGTCAQEKARPRCLCPPGWDGAYCDVLNVSCKAAALQKGVPVE 1117 

Qy 702 PCPPAHWGPNC IHTC NCHNGAFCSAYDG--ECKCTP 735 

II : I I : I I : I I I I : I I : I I 

Db 1118 HLCQHSGICINAGNTHHCQCPLGYTGSYCEEQLDECASNPCQHGATCSDFIGGYRCECVP 1177 

Qy 736 GWTGLYCTQR CPLGFYG KDCALICQCQN 763 

I : I : I I I I 1 I I I I I 

Db 1178 GYQGVNCEYEVDECQNQPCQNGGTCIDLVNHFKCSCPPGTRGLLCEENIDDCAGAPHCLN 1237 

Qy 764 GADC-DHISG QCTCRTGFM 781

I I I I I I ! I I : I 

Db 1238 GGQCVDRIGGYSCRCLPGFAGERCEGDINECLSNPCSSEGSLDCIQLKNNYQCVCRSAFT 1297 

Qy 782 GRHCE QKCPSGTYGYGCRQI CDCLNNSTCDHITGT CYCS PGWKGARCDQA 831

I I I I I II I I 1 I I : I I I I : I I I I : 

Db 1298 GRHCETFLDVCPQK PCLNGGTCAVASNVPDGFICRCPPGFSGARCQSS 1345 



RESULT 9
S42612
transmembrane protein precursor - zebra fish
C;Species: Brachydanio rerio (zebra fish)
C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 02-Aug-2002
C;Accession: S42612
R;Bierkamp, C.; Campos-Ortega, J.A.
Mech. Dev. 43, 87-100, 1993
A;Title: A zebrafish homologue of the Drosophila neurogenic gene Notch and its
pattern of transcription during early embryogenesis.
A;Reference number: S42612; MUID:94128602; PMID:8297791
A;Accession: S42612
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-2437 <BIE>
A;Cross-references: EMBL:X69088; NID:g433866; PIDN:CAA48831.1; PID:g433867
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
F;755-786/Domain: EGF homology <EGF1>
F;1023-1054/Domain: EGF homology <EGF>
F;1185-1216/Domain: EGF homology <EGF2>
F;1915-1947/Domain: ankyrin repeat homology <AN1>
F;1948-1980/Domain: ankyrin repeat homology <AN2>
F;1982-2014/Domain: ankyrin repeat homology <AN3>
F;2015-2047/Domain: ankyrin repeat homology <AN4>
F;2048-2080/Domain: ankyrin repeat homology <AN5>

Query Match 14.6%; Score 987; DB 2; Length 2437; 

Best Local Similarity 24.8%; Pred. No. 9.7e-45; 

Matches 310; Conservative 81; Mismatches 320; Indels 538; Gaps 70; 
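
The summary figures printed for each result appear to be related by simple arithmetic. In this report, Query Match agrees with Score divided by the query's perfect score (6744, from the report header), and Best Local Similarity agrees with Matches divided by the sum of Matches, Conservative, Mismatches, and Indels. This is an inference from the numbers printed here, not a documented GenCore definition. A minimal Python sketch (the function names are hypothetical) that checks the relationship for RESULT 9:

```python
PERFECT_SCORE = 6744  # "Perfect score" from the report header


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns.

    Assumed definition: columns = matches + conservative + mismatches + indels.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Values printed for RESULT 9 (S42612)
qm = query_match_pct(987)
bls = best_local_similarity_pct(310, 81, 320, 538)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

The same two functions can be applied to the other results in this listing as a quick consistency check on the OCR'd numbers.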



Qy 91 KSQCCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCSSACD 140

1 1 1 1 1 : 1 : 1 : : 1 : 1 1 : 1 : 1 II 1 1 1

Db 85 KCDCVLGF--SDRLCLTPVNHACMNSPCRNGGTCSLLTLDTFTCRCQPGWSGKTCQLA-- 140

Qy 141 GDHWGPHCTSRCQCKNGALCNPITG--ACHCAAGFRGWRCEDRCEQGTYGNDCH-QRCQC 197

II III 1 : 1 1 1 1 1 1:1 1

Db 141 DPCASN-PCANGGQCSAFESHYICTCPPNFHGQTCRQDV NECAVSPSPC 188

Qy 198 QNGATCDHVTGE--CRCPPGYTGAFCEDL CPPGK 229

: 1 1 II : 1 1 1 1 1 1 1 1 1 1 : 1 III

Db 189 RNGGTCINEVGSYLCRCPPEYTGPHCQRLYQPCLPSPCRSGGTCVQTSDTTHTCSCLPGF 248

Qy 230 HGPQCE QRC PCQNG 243

1 1 1 II 1 1 1 1

Db 249 TGQTCEHNVDDCTQHACENGGPCIDGINTYNCHCDKHWTGQYCTEDVDECELSPNACQNG 308

Qy 244 GVCHHVTG--ECSCPSGWMGTVCGQ PCPEGRFGKN 276

1 1 1 : 1 | | : | | | | : 1 1 1 1 1

Db 309 GTCHNTIGGFHCVCVNGWTGDDCSENIDDCASAACSHGATCHDRVASFFCECPHGRTGLL 368

Qy 277 C--SQEC QCHNGGTCDA--ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETC 324

1 1 1 1 II : 1 : 1 1 1 1 1 1 1 1 1 1 1 : 1

Db 369 CHLDDACI SNPCQKGSNCDTNPVSGKAICTCPPGYTGSACNQDIDECSLGAN 420

Qy 325 QCVNGGKCYHVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMS 382

: I I II : I : I



Db 421 PCEHGGRCLNTKGSFQCKCLQGYEGPRCEMDV NEC-KSNPC--QNDATCLDQI 470 

Qy 383 G--ECACKPGWSGLYCNET CSPGFYGEACQ QIC 413 

I i I I I : I : : I I I I I I 1 I 

Db 471 GGFHCICMPGYEGVFCQINSDDCASQPCLNGKCIDKINSFHCECPKGFSGSLCQVDVDEC 530 

Qy 414 S CQNGADCDSVTGK--CTCAPGFKGIDC STPCPL GTYGINCSSR 455

: I : I I I 1 I I I I I I I I I I : I I I I I 

Db 531 ASTPCKNGAKCTDGPNKYTCECTPGFSGIHCELDINECASSPCHYGVCRDGVASFTCDCR 590 

Qy 456 CG CKNDAVCSPVDGS--CTCKAGWHGVDCSIR 485

I hi I : : I I I I I I : I I 

Db 591 PGYTGRLCETNINECLSQPCRNGGTCQDRENAYICTCPKGTTGVNCEINIDDCKRKPCDY 650

Qy 486 CPSGTWGFGCNLTCQ CLNGGACNTLDG TCTCAPGWRG 522 

I I I lh I I I I ! : I I III I : I 

Db 651 GKCIDKINGYECVCEPGYSGSMCNINIDDCALNPCHNGGTC--IDGVNSFTCLCPDGFRD 708

Qy 523 EKC ELPCQDGTYGLNC AERC DCSHADGCHP 552 

I I : I I II I I : I 

Db 709 ATCLSQHNECSSNPCIHGSCLDQINS YRCVCEAGWMGRNCDININECLSNPCVNGGTCKD 768 

Qy 553 -TTGH-CRCLPGWSGVHCDSVCAEGRWGP NCSL 583 

I : I : I I I : I I : I I I III 

Db 769 MTSGYLCTCRAGFSGPNCQMNINECASNPCLNQGSCIDDVAGFKCNCMLPYTGEVCENVL 828

Qy 584 -PCY CKNGASCSPDD GICE 601 

I I I I II i : I : I I 

Db 829 APCSPRPCKNGGVCRESEDFQSFSCNCPAGWQGQTCEVDINECVRNPCTNGGVCENLRGG 888

Qy 602 CAPGFRGTTCQR ICSPGFYGHRCSQTCPQCV 632 

I I I I I I : : I I I I I I : : : I I 

Db 889 FQCRCNPGFTGALCENDIDDCEPNPCSNGGVCQDRVNGFVCVCLAGFRGERCAEDIDECV 948

Qy 633 HSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNC AGICT CTNNGTCNPIDR 686 

III: lh : I I : I I : I III II I I I 1 I : I 

Db 949 --SAPCRNGGNCTDCVNSYT CS--CPAGFSGINCEINTPDCTESSCFNGGTC--VDG 999

Qy 687 SCQCYPGWIGSDC SQP CPPAHWGPNC IH 714 

I I I I I : I : I I : I I I : I I I : 

Db 1000 ISSFSCVCLPGFTGNYCQHDVNECDSRPCQNGGSCQDGYGTYKCTCPHGYTGLNCQSLVR 1059 

Qy 715 TCN CHNGAFC--SAYDGECKCTPGWTGLYCTQ 744

I : I I I I I : I I I I I : 1 I 

Db 1060 WCDSSPCKNGGSCWQQGASFTCQCASGWTGIYCDVPSVSCEVAARQQGVSVAVLCRHAGQ 1119

Qy 745 RCPLGFYGKDC ALICQ CQNGADC-DHISG-QCTCRTGFMGRHCE 786 

II I : I I II I 1 I I I I I : : I II I : I : I 

Db 1120 CVDAGNTHLCRCQAGYTGSYCQEQVDECQPNPCQNGATCTDYLGGYSCECVPGYHGMNCS 1179 

Qy 787 QK CPSGTYGYGCRQICD CLNN 807 

: : I I I I I I I II 

Db 1180 KEINECLSQPCQNGGTCIDLVNTYKCSCPRGTQGVHCEIDIDDCSPSVDPLTGEPRCFNG 1239 

Qy 808 STC-DHITG-TCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSY 854

I I : I II I : I I I : I : : I : I : I I 

Db 1240 GRCVDRVGGYGCVCPAGFVGERCE GDVNE-CLSDPCDPSGS Y 1280



RESULT 10
A24420
notch protein - fruit fly (Drosophila melanogaster)
N;Alternate names: neurogenic repetitive locus protein
C;Species: Drosophila melanogaster
C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 10-Sep-1999
C;Accession: A24420; A24768; S09358; A05267
R;Kidd, S.; Kelley, M.R.; Young, M.W.
Mol. Cell. Biol. 6, 3094-3108, 1986
A;Reference number: A24420; MUID:87064624; PMID:3097517
A;Accession: A24420
A;Molecule type: DNA
A;Residues: 1-2703 <KID>
A;Cross-references: GB:K03508; NID:g157991; PIDN:AAA28725.1; PID:g157993
R;Wharton, K.A.; Johansen, K.M.; Xu, T.; Artavanis-Tsakonas, S.
Cell 43, 567-581, 1985
A;Reference number: A24768; MUID:86079539; PMID:3935325
A;Accession: A24768
A;Molecule type: mRNA
A;Residues: 1-48,'I',50-118,'R',120-230,'I',232-256,'N',258-266,'A',268-
872,'R',874-958,'R',960-1970,'FH',1973-2256,'G',2258-2264,'V',2266-
2406,'R',2408-2444,'L',2446-2703 <WHA1>
A;Note: the authors translated the codon ATC for residue 49 as Thr, ATT for
residue 2044 as Arg, GTA for residue 2265 as Ala, CGC for residue 2407 as His,
and CTT for residue 2445 as Arg
R;Tautz, D.
Nucleic Acids Res. 17, 6463-6471, 1989
A;Title: Hypervariability of simple sequences as a general source for
polymorphic DNA markers.
A;Reference number: S09358; MUID:89385974; PMID:2780284
A;Accession: S09358
A;Molecule type: DNA
A;Residues: 2505-2551,'QQQQ',2552-2576,'E',2578-2604 <TAU>
R;Wharton, K.A.; Yedvobnick, B.; Finnerty, V.G.; Artavanis-Tsakonas, S.
Cell 40, 55-62, 1985
A;Title: opa: a novel family of transcribed repeats shared by the Notch locus
and other developmentally regulated loci in D. melanogaster.
A;Reference number: A05267; MUID:85099329; PMID:2981631
A;Accession: A05267
A;Molecule type: DNA
A;Residues: 2504-2576,'E',2578-2611 <WHA2>
C;Genetics:
A;Gene: notch; opa
A;Cross-references: FlyBase:FBgn0004647
A;Map position: 8.96-9.36
A;Introns: 53/3; 84/3; 171/3; 240/3; 283/3; 2333/3; 2436/3; 2588/3
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
C;Keywords: differentiation; tandem repeat; transmembrane protein
F;27-43/Domain: transmembrane #status predicted <TMM1>
F;297-328/Domain: EGF homology <EGX1>
F;530-561/Domain: EGF homology <EGF1>
F;568-599/Domain: EGF homology <EGF>
F;988-1019/Domain: EGF homology <EGX2>
F;1064-1095/Domain: EGF homology <EGF3>
F;1187-1218/Domain: EGF homology <EGX3>
F;1746-1762/Domain: transmembrane #status predicted <TMM2>
F;1950-1982/Domain: ankyrin repeat homology <AN1>
F;1983-2015/Domain: ankyrin repeat homology <AN2>
F;1988-2004/Domain: transmembrane #status predicted <TMM3>
F;2017-2049/Domain: ankyrin repeat homology <AN3>
F;2050-2082/Domain: ankyrin repeat homology <AN4>
F;2083-2115/Domain: ankyrin repeat homology <AN5>
F;2538-2568/Region: glutamine-rich
F;2538-2568/Domain: neurogenic repetitive element #status predicted <OPA>

Query Match 14.5%; Score 978.5; DB 1; Length 2703; 

Best Local Similarity 26.8%; Pred. No. 3e-44; 

Matches 290; Conservative 102; Mismatches 297; Indels 395; Gaps 70; 

Qy 7 SCL SFICLLLCHWIGTASPLNLED--PNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

III : I I : : : | I : : : : : II : : I 

Db 502 SCLDDPGTFRCVCMPGFTGTQCEIDIDECQSNPC LNDGTC 541 

Qy 61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMC VPHCADKCVHGR 117 

I : I I I I : | | | : | I : I : | 

Db 542 HDKINGFKCS CALGF--TGARCQINIDDCQSQPCRNR 576 

Qy 118 CIAPNTCQCEPGWGGTNCS SACDGDHWGPHCTSRCQCKNGALCNPITG-ACH 168 

I I : I : I I I : I I : I : I I : II : : I 

Db 577 GICHDSIAGYSCECPPGYTGTSCEININDCDSN PCHRGKCIDDVNSFKCL 626

Qy 169 CAAGFRGW RCEDR CEQGTYG NDCHQRCQ 196

I I : I : I : I I hill h I I 

Db 627 CDPGYTGYICQKQINECESNPCQFDGHCQDRVGSYYCQCQAGTSGKNCEVNVNECHSN-P 685 

Qy 197 CQNGATC-DHVTG-ECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVC-HHVTG-E 252 

I I II II I : : h I I h I I II h : I I I I I II I I : 

Db 686 CNNGATCIDGINSYKCQCVPGFTGQHCE KNVDECI S-SPCANNGVCIDQVNGYK 738

Qy 253 CSCPSGWMGTVC GQP CPEGRFGKNCS QECQ-- 282

I I I I : I I I I I I I I II 

Db 739 CECPRGFYDAHCLSDVDECASNPCVNEGRCEDGINEFICHCPPGYTGKRCELDIDECSSN 798 

Qy 283 -CHNGGTC-DAATG-QCHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY-HV 335

I : I I I I I I I I I I I h : h hi I I I I I I I 

Db 799 PCQHGGTCYDKLNAFSCQCMPGYTGQKCETNIDDC VTNPCGNGGTCIDKV 848 

Qy 336 SG-ACLCEAGFAGERCEARLCPEGLYGI KC-DKRCPCHLENTHSCHPMSG ECACKP 389 

: I I : I : I I I I : : : I I II : I 1 I I III 

Db 849 NGYKCVCKVPFTGRDCESKMDP CASNRC KNEAKCTPSSNFLDFSCTCKL 897

Qy 390 GWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTG--KCTCAPGFKGIDC 439

I : : I I I : I : I h I I I I : I I : I I h : I I I 

Db 898 GYTGRYCDEDI DECSLSSPCRNGASCLNVPGSYRCLCTKGYEGRDCAINTDDCA 951 

Qy 440 STPCP LGTYGINC SSRC GCKNDAVCS PVDGS--CTCK 474

III : I I I I h I I I I I I I I 

Db 952 SFPCQNGGTCLDGIGDYSCLCVDGFDGKHCETDINECLSQPCQNGATCSQYVNSYTCTCP 1011 

Qy 475 AGWHGVDCSIR CPSGTWGFGCN LTCQ CLN 503 

I: I: : I I I lil: II III 

Db 1012 LGFSGINCQTNDEDCTESSCLNGGSCIDGINGYNCSCLAGYSGANCQYKLNKCDSNPCLN 1071 



Qy 504 GGACNTLDG--TCTCAPGWRGEKC ELPCQDGTYGLNCAERCDCSHADGCHPT 553

II:: I I I | : | : : | : 1 I : : I | | |

Db 1072 GATCHEQNNEYTCHCPSGFTGKQCSEYVDWCGQSPCENG ATCSQMK--HQF 1120

Qy 554 TGHCRCLPGWSGVHCD SVCAEGRWGPNCSLPCYCKNGASCSP--DDGICECAPGFRG 608

: hi 11:1 II 1 : II 1 1 1 : 1 : : I I : | : I

Db 1121 S--CKCSAGWTGKLCDVQTISCQDAADRKGLSLRQLCNNG-TCKDYGNSHVCYCSQGYAG 1177

Qy 609 TTCQR ICSPGFYGHRCSQTCPQCV HSSGPCH 639

: 1 1 : 1 1 1 1 1 1 : 1 II

Db 1178 SYCQKEIDECQSQPCQNGGTCRDLIGAYECQCRQGFQGQNCELNIDDCAPNPCQNGGTCH 1237

Qy 640 H--ITGLCDCLPGFTGALC NEVCPSGRFGKNCAGICTCTNNGTCNPIDR SCQC 690

: 11111:1 : 1 1 Mill II

Db 1238 DRVMNFSCSCPPGTMGI ICEINKDDCKPG ACHNNGSC--I DRVGGFECVC 1285

Qy 691 YPGWIGSDC SQPCP PAHWGPNCIHTCN 717

1 1 : : 1 : 1 III 1 1 1 : 1 1 :

Db 1286 QPGFVGARCEGDINECLSNPCSNAGTLDCVQLVNNYHCNCRPGHMGRHCEHKVDFCAQSP 1345

Qy 718 CHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHIS GQC 774

Mil: : 1 : 1 1 llllhl 1 : 1 1 1 1 II

Db 1346 CQNGGNCNIRQ SGHHCI--CNNGFYGKNCEL SGQDCDSNPCRVGNC 1389




Qy 775 TCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITG--TCYCS PGWKG 825

1 1 1 1 1 1 1 1 1 1 1 : I : : I I I Ml

Db 1390 WADEGFGYRCE--CPRGTLGEHCEIDTLDECSPNPCAQGAACEDLLGDYECLCPSKWKG 1447

Qy 826 ARCD 829

1 1 I

Db 1448 KRCD 1451





RESULT 11
S45306
notch 3 protein - mouse
C;Species: Mus musculus (house mouse)
C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 02-Aug-2002
C;Accession: S45306
R;Lardelli, M.; Dahlstrand, J.; Lendahl, U.
Mech. Dev. 46, 123-136, 1994
A;Title: The novel Notch homologue mouse Notch 3 lacks specific epidermal growth
factor-repeats and is expressed in proliferating neuroepithelium.
A;Reference number: S45306; MUID:95001556; PMID:7918097
A;Accession: S45306
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-2318 <LAR>
A;Cross-references: EMBL:X74760; NID:g483580; PIDN:CAA52776.1; PID:g483581
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
F;163-195/Domain: EGF homology <EGF1>
F;474-505/Domain: EGF homology <EGF>
F;854-885/Domain: EGF homology <EGF2>
F;1839-1871/Domain: ankyrin repeat homology <AN1>
F;1872-1904/Domain: ankyrin repeat homology <AN2>
F;1906-1938/Domain: ankyrin repeat homology <AN3>
F;1939-1971/Domain: ankyrin repeat homology <AN4>
F;1972-2004/Domain: ankyrin repeat homology <AN5>

Query Match 14.5%; Score 977.5; DB 2; Length 2318; 

Best Local Similarity 25.8%; Pred. No. 3e-44; 

Matches 322; Conservative 77; Mismatches 317; Indels 532; Gaps 74; 

Qy 9 LSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNWFK 68

I I I I I : I I I I I I II : : I I 

Db 62 LEAACLCLPGWVG--ERCQLEDP--C H S G P CAGRGVCQ S S WAGT ARF S 106 

Qy 69 CTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCV PHCADKCVHGR--CIAPN- 122

I : I I I I i 1:1111 : I : 

Db 107 C RCLRGF--QGPDCSQPDPCVSRPCVHGAPCSYGPDG 141

Qy 123 --TCQCEPGWGGTNCSSACDGDHWGPHC TSRCQ 153

I I I I : i : I I I ! I : I I I 

Db 142 RFACACPPGYQGQSCQSDIDECRSGTTCRHGGTCLNTPGSFRCQCPLGYTGLLCENPWP 201 

Qy 154 CKNGALC NPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATC-D 204

I : I I I :: I I I I I I I I : II : I M I I I 

Db 202 CAPS PCRNGGTCRQSS DVT YDCACLPGFEGQNCEWVD DCPGH-RCLNGGTCVD 254 

Qy 205 HV-TGECRCPPGYTGAFC-EDLCPPGKHGPQCE-QRCPCQNGGVCHHVTG--ECSCPSGW 259 

II I : I I I : I I I I I I : : I : I I I I I I : : I I I : I I 

Db 255 GVNTYNCQCPPEWTGQFCTEDV DECQLQPNACHNGGTCFNLLGGHSCVCVNGW 307

Qy 260 MGTVCGQ PCPEGRFGKNC--SQEC QCHNGGTC 289

III 111:11111! 

Db 308 TGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMGKTGLLCHLDDACVSNPCHEDAIC 367

Qy 290 DA--ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA--CL 340

I : I : 1111:11 I I I I : I I I : : I : I : I : I 

Db 368 DTNPVSGRAICTCPPGFTGGACDQDVDECSIG ANPCEHL--GRCVNTQGSFLCQ 419

Qy 341 CEAGFAGERCEARL--CPEGLYGIKCDKRCPCHLENTHSCHPMSGE--CACKPGWSGLYC 396

I I : I I I I : I I II I : I I : I I I : : I I I 

Db 420 CGRGYTGPRCETDVNECLSG PC--RNQATCLDRIGQFTCICMAGFTGTYC 467

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC-DSV 424 

II Ml II I : hlllll 

Db 468 EVDIDECQSSPCVNGGVCKDRVNGFSCTCPSGFSGSMCQLDVDECASTPCRNGAKCVDQP 527

Qy 425 TG-KCTCAPGFKGI DCS-TPCPLGTYGINCSSRCGCKNDAVCSPVDG SC 471 

I : I II I I : I I I I I I I II II I I I 

Db 528 DGYECRCAEGFEGTLCERNVDDCSPDPCHHG RC VDGIASFSC 569

Qy 472 TCKAGWHGVDCS I RCPSGTWGFGCNLTC-QCLNG- 504

I I : I : I I I I I I I I : I : 

Db 570 ACAPGYTGIRCESQVDECRSQPCRYGGKCLDLVDKYLCRCPPGTTGVNCEVNIDDCASNP 629 

Qy 505 GACNTLDG TCTCAPGWRGEKCEL PCQDGTYGLNC 538 

II II lllhll: I I I I : I 

Db 630 CTFGVCR--DGINRYDCVCQPGFTGPLCNVEINECASSPCGEGGSCVDGENGFHCLCPPG 687

Qy 539 AERCDCSHADGCHPTTG--HCRCLPGWSGVHCDSVCAEGRWGPNCSLP 584 

III II I I I I I II I I I I : 



Db 688 SLPPLCLPANHPCAHKPCSHG-VCHDAPGGFRCVCEPGWSGPRCSQSLA PDACES 741



Qy 585 CYCKNGASCSPDDGI CECAPGFRGTTCQRI CS 616 

I : I : I : I I I I I I I I I : I I : : | 
Db 742 QPCQAGGTCT-SDGIGFRCTCAPGFQGHQCEVLSPCTPSLCEHGGHCESDPDRLTVCSCP 800

Qy 617 PGFYGHRCSQTCPQCVHSS GPCHHITG--LCDCLPGFTGALCNE 658 

I I : I I I I : I : I I I : : I II I : M I : : 

Db 801 PGWQGPRCQQDVDECAGASPCGPHGTCTNLPGNFRCICHRGYTGPFCDQDIDDCDPNPCL 860 

Qy 659 VCPSGRFGKNCA GICT 674 

I I I I I III 
Db 861 HGGSCQDGVGSFSCSCLDGFAGPRCARDVDECLSSPCGPGTCTDHVASFTCACPPGYGGF 920

Qy 675 CTNNGTCNPIDR SCQCYPGWIGSDC SQP C 703 

I I I II : I I I I I I : I : I | : | j 

Db 921 HCEIDLPDCSPSSCFNGGTC--VDGVSSFSCLCRPGYTGTHCQYEADPCFSRPCLHGGIC 978

Qy 704 PPAHWGPNCIHTCN CHNGAFCSAYDGECKCTPGWTG 739 

I I I I I I I I I I 11111:1 

Db 979 NPTHPGFEC--TCREGFTGSQCQNPVDWCSQAPCQNGGRCVQTGAYCICPPGWSGRLCDI 1036 

Qy 740 --LYCTQR CPLGFYGKDC ALICQCQN 763

111= I I I I I ||: 

Db 1037 QSLPCTEAAAQMGVRLEQLCQEGGKCIDKGRSHYCVCPEGRTGSHCEHEVDPCTAQPCQH 1096 

Qy 764 GADCDHISG--QCTCRTGFMGRHCEQ KCPSGTYGY 796 

II I I I I : I I I 

Db 1097 GGTCRGYMGGYVCECPAGYAGDSCEDNIDECASQPCQNGGSCIDLVARYLCSCPPGTLGV 1156 

Qy 797 GC RQICD CLNNSTCDHITG--TCYCSPGWKGARCD 829

I M II : I I I : I I I I I : I I : 

Db 1157 LCEINEDDCDLGPSLDSGVQCLHNGTCVDLVGGFRCNCPPGYTGLHCE 1204 



RESULT 12
S78549
notch3 protein - human
C;Species: Homo sapiens (man)
C;Date: 24-Jul-1998 #sequence_revision 24-Jul-1998 #text_change 08-Sep-2002
C;Accession: S78549; S71825
R;Joutel, A.; Tournier-Lasserve, E.
submitted to the EMBL Data Library, April 1997
A;Reference number: S78549
A;Accession: S78549
A;Molecule type: mRNA
A;Residues: 1-2321 <JOU1>
A;Cross-references: EMBL:U97669; NID:g2668591; PIDN:AAB91371.1; PID:g2668592
R;Joutel, A.; Corpechot, C.; Ducros, A.; Vahedi, K.; Chabriat, H.; Mouton, P.;
Alamowitch, S.; Domenga, V.; Cecillion, M.; Marechal, E.; Maciazek, J.;
Vayssiere, C.; Cruaud, C.; Cabanis, E.A.; Ruchoux, M.M.; Weissenbach, J.; Bach,
J.F.; Bousser, M.G.; Tournier-Lasserve, E.
Nature 383, 707-710, 1996
A;Title: Notch3 mutations in CADASIL, a hereditary adult-onset condition causing
stroke and dementia.
A;Reference number: S71825; MUID:97032728; PMID:8878478
A;Accession: S71825
A;Status: nucleic acid sequence not shown
A;Molecule type: DNA
A;Residues: 67-113;138-194;268-333,'G',335-346;536-613;716-765;1240-1279;1815-
1888 <JOU2>
A;Cross-references: EMBL:U97669
C;Genetics:
A;Gene: notch3
A;Map position: 19p13.1
C;Function:
A;Description: may be involved in pathogenesis of CADASIL, causing a type of
stroke and dementia
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
C;Keywords: tandem repeat; transmembrane protein
F;123-155/Domain: EGF homology <EGX1>
F;162-194/Domain: EGF homology <EGF1>
F;240-271/Domain: EGF homology <EGX2>
F;318-349/Domain: EGF homology <EGF>
F;473-504/Domain: EGF homology <EGX3>
F;853-884/Domain: EGF homology <EGF3>
F;928-959/Domain: EGF homology <EGX4>
F;1838-1870/Domain: ankyrin repeat homology <AN1>
F;1871-1903/Domain: ankyrin repeat homology <AN2>
F;1905-1937/Domain: ankyrin repeat homology <AN3>
F;1938-1970/Domain: ankyrin repeat homology <AN4>
F;1971-2003/Domain: ankyrin repeat homology <AN5>
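
The feature lines in these entries follow a regular shape: `F;start-end/Kind: description <TAG>`, with the trailing tag sometimes absent. As a rough illustration only, here is a Python sketch for pulling the fields out of one such line. The function name `parse_feature` and the returned dictionary keys are hypothetical, and the pattern assumes the OCR spacing has already been cleaned up; multi-range lines such as the disulfide-bond lists are deliberately not handled.

```python
import re

# Pattern for a single-range feature line: F;start-end/Kind: description <TAG>
# The trailing <TAG> is optional. Multi-range lines do not match.
FEATURE_RE = re.compile(
    r"^F;(?P<start>\d+)-(?P<end>\d+)/(?P<kind>[^:]+):\s*"
    r"(?P<desc>.*?)\s*(?:<(?P<tag>[^>]+)>)?$"
)


def parse_feature(line):
    """Return a dict for a single-range F; line, or None if it does not match."""
    m = FEATURE_RE.match(line.strip())
    if m is None:
        return None
    start, end = int(m["start"]), int(m["end"])
    return {
        "start": start,
        "end": end,
        "length": end - start + 1,  # inclusive residue range
        "kind": m["kind"].strip(),
        "description": m["desc"],
        "tag": m["tag"],
    }


feature = parse_feature("F;123-155/Domain: EGF homology <EGX1>")
print(feature)
```

Returning `None` for lines that do not fit the single-range shape keeps the sketch from silently misreading comma-separated range lists.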



Query Match 14.4%; Score 974; DB 2; Length 2321; 

Best Local Similarity 25.0%; Pred. No. 4.5e-44; 

Matches 304; Conservative 93; Mismatches 282; Indels 537; Gaps 70; 



Qy 59 SCTDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKC 113

: 1 1 : 1 : 1 | | : : | : | : |

Db 250 TCVDGVNTYNC QCPPEW--TGQFCTED-VDECQLQPN 283

Qy 114 -VH--GRC IAPNTCQCEPGWGGTNCSSACDG DHWGPHCTSR CQC 154

III : : : | | | | | : | | | III II

Db 284 ACHNGGTCFNTLGGHSCVCVNGWTGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMG 343

Qy 155 KNGALC NPITG--ACHCAAGFRGWRCE DRCE 183

MM 1 I : I | | | | | | : ||

Db 344 KTGLLCHLDDACVSNPCHEDAICDTNPVNGRAICTCPPGFTGGACDQDVDECSIGANPCE 403

Qy 184 QGTYGNDCHQ RCQ CQNGATCDHVTGE--CRCPPGYTG 218

II- Is II: 1 : 1 1 II 1 : 1 1 1 : 1 1

Db 404 HLGRCVNTQGSFLCQCGRGYTGPRCETDVNECLSGPCRNQATCLDRIGQFTCICMAGFTG 463

Qy 219 AFCE DLCPPGKHGPQCEQRCPCQNGGVC-HHVTG-ECSCPSGWMGTVC 264

Ml M 1 1 1 1 1 1 1 1 II 1 : 1 1 M : 1 : 1

Db 464 TYCEVDIDEC QSS PCVNGGVCKDRVNGFSCTCPSGFSGSTCQLDVDECAS 513

Qy 265 GQP - CPEGRFGKNCSQ ECQ CHNGGTCDA-ATGQCHCSPG 301

M IN 1 1 : M 1 1 : 1 1 I : 1 1 : 1 1

Db 514 TPCRNGAKCVDQPDGYECRCAEGFEGTLCDRNVDDCSPDPCHHGRCVDGIASFSCACAPG 573

Qy 302 YTGERCQDE CPVGTYGVLC AETCQ 325

Mill:: || || | | | : |

Db 574 YTGTRCESQVDECRSQPCRHGGKCLDLVDKYLCRCPSGTTGVNCEVNIDDCASNPCTFGV 633



Qy 326 CVNGGKCYHVSGACLCEAGFAGERCEARL CPEGLYGIKCDKRCP 369

I : I I I : I : I I I I : I : I | : | | | 

Db 634 CRDGINRYD CVCQPGFTGPLCNVEINECASSPCGEGGSCVDGENGFRC--LCPPGS 687

Qy 370 CHLENTHSC HPMSGECACKPGWSGLYCNE 398 

I I : I I I I I : I I II I I : : 

Db 688 LPPLC-LPPSHPCAHEPCSHGICYDAPGGFRCVCEPGWSGPRCSQSLARDACESQPCRAG 746 

Qy 399 TCSPGFYGEACQQI--CS CQNGADCDSVTGK CTCAPGFKG-- 436

I I 1 I I I : : I : I : : I I : I I : I : I | : : | 
Db 747 GTCSSDGMGFHCTCPPGVQGRQCELLSPCTPNPCEHGGRCESAPGQLPVCSCPQGWQGPR 806 

Qy 437 IDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTW 491 

: I : I I I : I I i : : I I I I I I : I I 
Db 807 CQQDVDECAGPAPCGPHGI-CTNLAG SFSCTCHGGYTGPSCDQDIND 852 

Qy 492 GFGCNLTCQCLNGGACNTLDG TCTCAPGWRGEKC ELPCQDGTYGLNCA 539

I : I I I I I : I II : I : I I I : I : I I I | | | 
Db 853 CDPN-PCLNGGSCQ--DGVGSFSCSCLPGFAGPRCARDVDECLSNPCGPGT CT 902

Qy 540 ERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI 599 

• I = I I Ih I II: I : I I I i I I : I I I : 

Db 903 D HVASFTCTCPPGYGGFHCEQDL PDCS-PSSCFNGGTCV--DGV 943 

Qy 600 CECAPGFRGTTCQR ICSPGFYGHRC SQTCPQC 631 

I I M: I II : | | Ml I I I I I 

Db 944 NSFSCLCRPGYTGAHCQHEADPCLSRPCLHGGVCSAAHPGFRCTCLESFTGPQCQTLVDW 1003 

Qy 632 VHSSGPCHHITGLCDCLPGFTGALCNE 658 

: I I I I I I : : I I I : 

Db 1004 CSRQPCQNGGRCVQTGAYCLCPPGWSGRLCDIRSLPCREAAAQIGVRLEQLCQAGGQCVD 1063 

Qy 659 VCPSGRFGKNC AGIC TCTNNGTCNPI--DRSCQCYPGWIGSDC 699 

MINIM I I : I I I I : I I I : I : I 

Db 1064 EDSSHYCVCPEGRTGSHCEQEVDPCLAQPCQHGGTCRGYMGGYMCECLPGYNGDNCEDDV 1123 

Qy 700 SQPC PPAHWGPNCIHTCNCH 719 

MM | | | | | : | | 

Db 1124 DECASQPCQHGGSCIDLVARYLCSCPPGTLGVLCEINEDDCGPGPPLDSGPRCLHNGTCV 1183 

Qy 720 N--GAFCSAYDGECKCTPGWTGLYCTQ RCPLG 749 

: I I I I I I M II I || 

Db 1184 DLVGGF RCTCPPGYTGLRCEADINECRSGACHAAHTRDCLQDPGGGFRCLCHAG 1237

Qy 750 FYGKDCALI CQ CQNGADCDHISG QCTCRTGFMGRHCEQ 787

Ml: I : I I M I I || MM: 

Db 1238 FSGPRCQTVLSPCESQPCQHGGQCRPSPGPGGGLTFTCHCAQPFWGPRCERVARSCRELQ 1297 

Qy 788 KCPSGTYGYGCRQI CDCLNNSTCDHIT 814 

II I I I I I I : : I 

Db 1298 CPVGVPCQQTPRGPRCACPPGLSGPSCRSFPGSPPGASNASCAAAPCLHGGSCRPAPLAP 1357 

Qy 815 -GTCYCSPGWKGARCD 829

I I : I I I II : 
Db 1358 FFRCACAQGWTGPRCE 1373



RESULT 13
T31070
notch homolog - sea urchin (Lytechinus variegatus)
C;Species: Lytechinus variegatus (variegated urchin)
C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 31-Jan-2000
C;Accession: T31070
R;Sherwood, D.R.; McClay, D.R.
Development 124, 3363-3374, 1997
A;Title: Identification and localization of a sea urchin Notch homologue:
insights into vegetal plate regionalization and Notch receptor regulation.
A;Reference number: Z20966; MUID:97454256; PMID:9310331
A;Accession: T31070
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-2531 <SHE>
A;Cross-references: EMBL:AF000634; NID:g2570350; PID:g2570351; PIDN:AAB82088.1
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology

Query Match 14.3%; Score 964.5; DB 2; Length 2531; 

Best Local Similarity 25.6%; Pred. No. 1.6e-43; 

Matches 303; Conservative 92; Mismatches 312; Indels 477; Gaps 74; 

Qy 6 NSCLSFICLLLCHWIGTASPLNLED--PNVCSHWESYSVTVQESYPHPFDQIYYTSCTDI 63

I : : I I : : I I : I : I I I I : | | 

Db 336 NTYGNFSCICVRGWEGQTCEINKDDCTPNPCQ FEGECEDR 375 

Qy 64 LNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRC 118

Ml I IN : | : | | | |: | 

Db 376 VASFKCT CPPG--RTGLLC--HLEDACMSNPCHHTAQ 408

Qy 119 IAPNT--CQCEPGWGGTNCSSACDGDHWGPHCTSRCQ--CKNGALCNPITG--ACH 168

: 1 I I : I I I I I I : I : : I I I : I 

Db 409 CSTSWDGSFICDCATGYQGFNCSEDID ECSLSMDSICQSGGTCQNFDGGWSCL 462 

Qy 169 CAAGFRGWRCE DRCE QGTYG NDCHQRCQ 196 

I I I I I I I II: : I I : 

Db 463 CSSGFTGSRCETDIDECDDDPCYNGGTCLNKRGGYACICLTGFTGTLCETDINECSSN-P 521

Qy 197 CQNGATCDHVTG--ECRCPPGYTGAFCE DLCPPGK 229 

I I I I : I = I I I I I I I I I I : || | 

Db 522 CLNGASCFDITGRFECACLAGYTGTTCQVNIDDCQSSPCENGGTCIDGVNQFTCLCETGY 581 

Qy 230 HGPQCE QRC PCQNGGVCHHVTG--ECSCPSGWMGTVC GQP 267

I : I I I I I I i ! I I : | : | : M | I I I 

Db 582 EGHRCEMDSDECASRPCMNGGVCEDLIGFYQCNCPVGTSGDNCEYNHYDCSSNPCVNDGT 641 

Qy 268 CPEGRFGKNCSQ ECQ CHNGGTC-DAATG 294 

I M IN: : I : I I II I I I I I 
Db 642 CVDGINEYTCMCHEGYRGLNCEEDIDDCESRPCHNGGTCVDEVNGYHCLCPIGYHDPFCM 701 

Qy 295 QCHCSPGYTGERCQ DECPVGTYGVLCAETCQCV 327

I I I I I I I I Ml : I 

Db 702 SNINECSSNPCVNGGSCHDGVNEYSCECMAGYTGTRCTDDFDEC SSNPCQ 751

Qy 328 NGGKC--YHVSGACLCEAGFAGERCEARL-CPEGLYG 361

Db 752 HGGTCDNRHAFYNCTCQAGYTGLNCEVNIDDCVDEPCLNGGICIDEVNSFQCVCPQTFVG 811



Qy 362 IKCD-KRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQ ICS 414 

: I : : I I I : I I : : I I I : I I I I 

Db 812 LLCETERSPC EDNQCQ-NGATCVYSEDYAGYSCR--CTSGFQGNFCDDDRNECLFSP 865

Qy 415 CQNGADCDSVTG--KCTCAPGFKGIDC STP CPLGT 447

I : I I I : : I : I : I I I : I I II 11 

Db 866 CRNGGSCTNLEGSFECSCLPGYDGPICEINIDECASGPCTNGGICTDLIDDYFCSCQRGF 925 

Qy 448 YGINC SSRC GCKNDAVCSP-VDG-SCTCKAGWHGVDCSIRCPSGTWGFGCNLTC 499 

III : I I : I I I I I : I : I I : I : I I I I 
Db 926 TGKNCQNDTDECLSSPCRNGATCHEYVDS YTCSCLVGFSGMHCEINDQDCT TS 978

Qy 500 QCLNGGACNTLDG TCTCAPGWRGEKCEL PCQDGTYGLNCAER 541 

11111:11 Nihil:: I I : : I I : I 

Db 979 SCLYGGTC--IDGWSYTCECVTGYTGSNCQIEINECDSDPCENGA TCQDRFGSYSC 1033

Qy 542 -CD CSH-ADGCHP TTGH CRCLPGWSGVHCD 569 

II I I I I I I I I I I I I 

Db 1034 HCDVGFTGLNCEHWQWCSPQNNPCYNGATCVAMGHLYECHCASNWIGKLCDVPKVSCDI 1093 

Qy 570 SVCAEGRWGPNC SLPCYCKN GASCSPDDGI CECAPGFRGTTC 611 

: I II | | | : : | : | : | | 1 | | | 

Db 1094 AASDKNVTRSELCLN GGTCIDATSSHSCLCQDGYTGSYCEWIDECASAPCHNGGTC 1150 

Qy 612 QR ICSPGFYGHRCSQTCPQCVHSSGPCHHITGLC-DCLPGFTGALCNEVCPSG 663 

I II I I I I I : I : I I II : I I I : I : I I : I I : I 

Db 1151 TDGVYSYTCSCLPGFEGPRCQQNINEC--ASSPCHN-GGQCHDMVNGYT CS--CPAG 1202

Qy 664 RFGKNCA GIC TCTNNGTCNPIDR SCQCYPGWIGSDC SQPCPP 705

I : I : I I : I I I I : : I I I : : I i I I I I 

Db 1203 TQGTDCSINLDDCYEGACYHGGVC--IDQVGTYTCDCPLGFVGQHCEGDVNECLSNPCDP 1260

Qy 706 AHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQ CQ 762 

: I : I : : I I I I : I I I I I II II 

Db 1261 V-GSQDCVQLINNY QCVCKPGYTGQDCEQEIP NCQNDPCQ 1299 

Qy 763 NGADC--DHISGQCTCRTGFMGRHCEQK CPSG 792

I I I I I I I I I I I I I I I 

Db 1300 NNGLCLPSDEGYYCDCLRGFTGVHCETKLTPCGTHPCQNEGTCMEYGDDFDDYTCMCPSG 1359 

Qy 793 TYGYGCR QICDCLNNSTC--DHITGTCYCSPGWKGARC 828

II | : | | | : : I I I I I 

Db 1360 VSGDNCEIDYNECASSPCINGGTCLDEYGQYRCDCPATWNGRNC 1403



RESULT 14
A40136
fibropellin Ia - sea urchin (Strongylocentrotus purpuratus)
N;Alternate names: epidermal growth factor homolog precursor
N;Contains: alternatively spliced fibropellin Ib (EGFI)
C;Species: Strongylocentrotus purpuratus (purple urchin)
C;Date: 13-May-1992 #sequence_revision 17-Sep-1997 #text_change 21-Jul-2000
C;Accession: A40136; B40136; C40136; A29316; A43131
R;Delgadillo-Reynoso, M.G.; Rollo, D.R.; Hursh, D.A.; Raff, R.A.
J. Mol. Evol. 29, 314-327, 1989
A;Title: Structural analysis of the uEGF gene in the sea urchin
Strongylocentrotus purpuratus reveals more similarity to vertebrate than to
invertebrate genes with EGF-like repeats.
A;Reference number: A40136; MUID:90112459; PMID:2514273
A;Accession: A40136
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-114 <DE1>
A;Cross-references: GB:X17530; NID:g10225; PID:g667061
A;Accession: B40136
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: DNA
A;Residues: 181-251,329-370,'R',372-408,'RA',411-441 <DE2>
A;Accession: C40136
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: DNA
A;Residues: 'K',747-821,898-978 <DE3>
R;Hursh, D.A.; Andrews, M.E.; Raff, R.A.
Science 237, 1487-1490, 1987
A;Title: A sea urchin gene encodes a polypeptide homologous to epidermal growth
factor.
A;Reference number: A29316; MUID:87319677; PMID:3498216
A;Accession: A29316
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 'S',280-481,786-1064 <HUR>
A;Cross-references: GB:M17421; NID:g161474; PIDN:AAA30050.1; PID:g552260
R;Hunt, L.T.; Barker, W.C.
FASEB J. 3, 1760-1764, 1989
A;Title: Avidin-like domain in an epidermal growth factor homolog from a sea
urchin.
A;Reference number: A43131; MUID:89196806; PMID:2784773
A;Contents: annotation
C;Comment: EGF homology repeats 10-17 are spliced out in the short form
(fibropellin Ib).
C;Superfamily: C1r/C1s repeat homology; EGF homology
F;1-19/Domain: signal sequence #status predicted <SIG>
F;20-1064/Product: fibropellin I #status predicted <FIB>
F;23-54/Domain: EGF homology <EG01>
F;57-175/Domain: C1r/C1s repeat homology <C1R>
F;180-211/Domain: EGF homology <EG02>
F;218-249/Domain: EGF homology <EG03>
F;256-287/Domain: EGF homology <EG04>
F;294-325/Domain: EGF homology <EG05>
F;332-363/Domain: EGF homology <EG06>
F;370-401/Domain: EGF homology <EG07>
F;408-439/Domain: EGF homology <EG08>
F;446-477/Domain: EGF homology <EG09>
F;484-515/Domain: EGF homology <EG10>
F;522-553/Domain: EGF homology <EG11>
F;560-591/Domain: EGF homology <EG12>
F;598-629/Domain: EGF homology <EG13>
F;636-667/Domain: EGF homology <EG14>
F;674-705/Domain: EGF homology <EG15>
F;712-743/Domain: EGF homology <EG16>
F;750-781/Domain: EGF homology <EG17>
F;788-819/Domain: EGF homology <EG18>
F;826-857/Domain: EGF homology <EG19>
F;864-895/Domain: EGF homology <EG20>
F;902-933/Domain: EGF homology <EG21>
F;936-1064/Region: avidin-like
F;23-34,28-43,45-54,62-88,180-191,185-200,202-211,218-229,223-238,240-249,256-
267,261-276,278-287,294-305,299-314,316-325,332-343,337-352,354-363,370-381,375-
390,392-401,408-419,413-428,430-439,446-457,451-466,468-477,484-495/Disulfide
bonds: #status predicted
F;489-504,506-515,522-533,527-542,544-553,560-571,565-580,582-591,598-609,603-
618,620-629,636-647,641-656,658-667,674-685,679-694,696-705,712-723,717-732,734-
743,750-761,755-770,772-781,788-799,793-808,810-819,826-837,831-846,848-857,864-
875,869-884,886-895,902-913,907-922,924-933/Disulfide bonds: #status predicted

Query Match 14.2%; Score 954.5; DB 2; Length 1064; 

Best Local Similarity 28.0%; Pred. No. 2.5e-43; 

Matches 290; Conservative 93; Mismatches 305; Indels 347; Gaps 69; 

Qy 30 DPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNWFKCTRHRVSYRTAYRHGEKTMYR 89

I I I : I : : i I I : : I : M 

Db 181 DPNLCQNG AACTDLVNDYACT 201 

Qy 90 RKSQCCPGFYESGEMC VPHCA-DKCVHGRCIAPN TCQCEPGWGGTNCSSACDG 141 

I I I I : I I : I I I I : I I I I I : I I : : 

Db 202 CPPGF--TGRNCEIDIDECASDPCQNGGACVDGVNGYVCNCVPGFDGDECENNIN- 254

Qy 142 DHWGPHCTSRCQCKNGALCNPITGA CHCAAGFRGWRCE DRCEQGTYGNDCHQR 194 

I i I I I : I : I I I I I I I I I I II 
Db 255 ECAS-SPCLNGGIC--VDGVNMFECT CLAGFTGVRCEVNIDECAS 296

Qy 195 CQCQNGATC-DHVTG-ECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGE 252 

I I I I I I : I I I I I : : I I I : : • I II Ml I : 

Db 297 APCQNGGICIDGINGYTCSCPLGFSGDNCEN NDDECSS-I PCLNGGTCVDLVNA 349 

Qy 253 --CSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTC-DAATG-QCHCSPGYTGERCQ 308 

I I I I I I I I : I I I i I I I I I I I I I I I : 

Db 350 YMC VC AP GWT G P T C ADN IDE CA- SAPCQNGGVCI DGVNGYMCDCQPGYTGTHCE 402 

Qy 309 DECPVGTYGVLCAETCQCVNGGKCYH-VSG-ACLCEAGFAGERCEARLCPEGLYGIK 363 

III I I I I I I : I i : I I I I I I 

Db 403 TDI DEC ARPPCQNGGDCVDGVNGYVCICAPGFDGLNCE 440 

Qy 364 CDKRCPCHLENTHSC--HPMSGECACKPGWSGLYCNETCSPGFYGEACQ QICS C 415 

I I I I I : I I I I i I : I I : I : I 

Db 441 NNIDECASRPCQNGAVCVDGVNGFVC--TCSAGYTGVLCETDINECASMPC 489 

Qy 416 QNGADC-DSVTGK-CTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCS-PVDG-SC 471 

I I I I I i I I I I I I : I : I I I : I I : I I I : I : I I 

Db 490 LNGGVCTDLWGYI CTCAAGFEGTNCETDTD ECAS-FPCQNGATCTDQVNGYVC 542 

Qy 472 TCKAGWHGVDC SIRCPSG TWGFGCNL TCQ 500 

I I I : I I I I I : I I : I II: 

Db 543 TCVPGYTGVLCETDINECASFPCLNGGTCNDQVNGYVCVCAQDTSVSTCETDRDECASAP 602 

Qy 501 CLNGGAC-NTLDG-TCTCAPGWRGEKCEL PCQDGTYGLNCAERCDCSHADGC 550 

I I I I I I I : : : I I I I I I I i I I : I I : I I I I : : : I : 
Db 603 CLNGGACMDVVNGFVCTCLPGWEGTNCEINT DECAS SPCMNG--GL-CVDQVN-SYV 655 



Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASC— SPDDGICECAPGFRG 608 

i I M I : : I : I I : I I Mil I I II h 

Db 656 CFCLPGFTGIHCGTEIDECASSP CLNGGQCIDRVDS YECVCAAGYTA 7 02 

Qy 609 TTCQ RICSPGFYGHRCSQTCPQCVHSSGPC 638 

|| I : I I : | I : | : | | I 

Db 703 VRCQINIDECASAPCQNGGVCVDGVNGYVCNCAPGYTGDNCETEIDEC — ASMPCLNGGA 760 

Qy 639 --HHITG-LCDCLPGFTGALC NEVCPSGRFGKNCAGICTCTNNGTCNPIDRS 687 

: I | | : | : | I : I : I : I I : I I I I I 
Db 7 61 CIEMVNGYTCQCVAGYTGVICETDIDECASAPCQNG GVCTDTINGYI 8 07 

Qy 688 CQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDG ECKCTPGWTGLYCT 743 

I I I I : I I : I I I I I I II I : I I : : I II 

Db 808 CACVPGFTGSNCETNIDECASDP CLNGGIC — VDGVNGFVCQCPPNYSGTYCE 858 

Qy 744 QRCPLGFYGKDCALICQCQNGADCDHISGQ— CTCRTGFMGRHCE QKCPSGTYGYGC 798 

I MINI:: | | | : | : : | | : | I 
Db 859 1 S L D AC R SMP CQN GAT CVNVGAD YVC E C VP G YAGQN C E I D I N E CAS 904 

Qy 799 RQICDCLNNSTC-DHITG-TCYCSPGWKGARCDQAGVI IVGNL NSLSRTSTA 848 

11 I I I I I I I I : I I : : I : = : : : : | | | 

Db 905 LPCQNGGLCIDGIAGYTCQCRLGYIGVNCEEVGFCDLEGMWYNECNDQVTITKTSTG 961 

Qy 849 LPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMRWNADYTI 908

: : I : : I : I : I I I h 

Db 9 62 M ML G D YMT YN E RAL GYAAP T VWG YAS N NYDFPS 995 

Qy 909 SG-TLPHSNGGNANS 922 

I I : I I : I 

Db 996 FGFTWRDNGQSTTS 1010 



RESULT 15 
T09059 

notch4 - mouse 

C;Species: Mus musculus (house mouse)
C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 08-Sep-2002
C;Accession: T09059
R;Rowen, L.; Mahairas, G.; Qin, S.; Ahearn, M.E.; Bankers, C.; Lasky, S.;
Loretz, C.; Schmidt, S.; Tipton, S.; Traicoff, R.; Zackrone, K.; Hood, L.
submitted to the EMBL Data Library, October 1997
A;Description: Sequence of the mouse major histocompatibility locus class III
region.
A;Reference number: Z16543
A;Accession: T09059
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1964 <ROW>
A;Cross-references: EMBL:AF030001; NID:g2564945; PID:g2564947
C;Genetics:
A;Gene: notch4
A;Map position: 17
A;Introns: 22/1; 49/2; 148/1; 264/1; 305/1; 384/1; 436/1; 501/1; 539/1; 577/1;
618/1; 671/2; 720/1; 771/1; 810/2; 839/3; 890/1; 951/3; 1036/1; 1073/3; 1248/2;
1376/2; 1435/1; 1508/2; 1531/3; 1578/1; 1679/3; 1729/1; 1761/3
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
C;Keywords: receptor; signal transduction
F;514-545/Domain: EGF homology <EGF>



Query Match 14.1%; Score 952.5; DB 2; Length 1964; 

Best Local Similarity 26.5%; Pred. No. 5.4e-43; 

Matches 298; Conservative 68; Mismatches 315; Indels 445; Gaps 65; 

Qy 95 CP-GFYESGEMCVPHCADKCV HGRCIAPNT CQCEPGWGGTNCSSACDGDH 14 3 

I I I I : ! : I I : I II : I 

Db 102 CPSGF--TGDRCQTHLEELCPPSFCSNGGHCYVQASGRPQCSCEPGWTGEQCQLR 154 

Qy 144 WGPHCTSRCQCKNGALCNPITG — ACHCAAGFRGWRCE DRCEQGTYGNDC 191 

I : : I I I : I I I I I I I I 1 I I I I 
Db 155 — DFCSAN-PCANGGVCLATYPQIQCRCPPGFEGHTCERD1NECFLEPGPCPQGT SC 2 08 

Qy 192 HQ RCQ CQNGATCD HVTGE-CRCPPGYTGA 219 

| : I : Mill II 11111:11 

Db 209 HNTLGSYQCLCPVGQEGPQCKLRKGACPPGSCLNGGTCQLVPEGHSTFHLCLCPPGFTGL 2 68 

Q y 220 FCE DLCPPGKHG PQCEQRCP--CQNGGV 245 

II III ! : I I I I I : I I I 

Db 2 69 DCEMNPDDCVRHQCQNGATCLDGLDTYTCLCPKTWKGWDCSEDIDECEARGPPRCRNGGT 32 8 

Qy 246 CHHVTG — ECSCP SGWMGTVCGQP CPEGRFGKNCS 278 

I : I I I I I I I I : I I I I I I 

Db 32 9 CQNTAGS FHCVCVS GWGGAGCEENLDDCAAATCAPGSTCI DRVGS FS CLCP PGRTGLLCH 3 88 

Qy 27 9 QE--C QCHNGGTC— DAATGQ— CHCSPGYTGERCQ DECPVGTYGVLCAETCQC 326 

II II I : II 11111:11 III: I I 

Db 389 LEDMCLSQPCHVNAQCSTNPLTGSTLCICQPGYSGSTCHQDLDECQMAQQG PSPC 4 43 

Qy 327 VNGGKCYHVSGA — CLCEAGFAGERCEAR LCPEGLY 3 60 

: I I I : I : I I I I : I II II Mill 
Db 444 EHGGSC1NTPGSFNCLCLPGYTGSRCEADHNECLSQPCHPGSTCLDLLATFHCLCPPGLE 503 

Qy 361 GIKCD KRC PCHLENTHSCHPMSG— ECACKPGWSGLYCNE 398 

I I : I II I : I I : : I I I I : : I I : 

Db 504 GRLCEVEVNECTSNPC — LNQAACHDLLNGFQCLCLPGFTGARCEKDMDECSSTPCANGG 561 

Qy 399 TCSPGFYGEACQQICS CQNGADCDSVTGK — CTCAPGFKGIDC 439 

I II I I I : : I I I I : I I I II I I I 

Db 5 62 RCRDQPGAFYCECLPGFEGPHCEKEVDECLSDPCPVGASCLDLPGAFFCLCRPGFTGQLC 621 

Qy 440 STP CPLGTYGI NCSSRCG 457 

I I I I : I III 

Db 622 EVPLCTPNMCQPGQQCQGQEHRAPCLCPDGS PGCVPAEDNCPCHHGHCQRSLCVCDEGWT 681 

Qy 458 CKNDAVCSPVDG--SCTCKAGWHGVDCS IRCPSGTW 491 

I : I I : I I I I I : I : I I II I 

Db 682 GPECETELGGCI STPCAHGGTCHPQPSGYNCTCPAGYMGLTCSEEVTACHSGPCLNGGSC 741 

Qy 4 92 GFGCN LTCQCLNGGACNTLDGT--CTCAPGWRGEKCE- 52 6 

I : I :: I I I I I I I I I I I I :: I I I 

Db 742 SIRPEGYSCTCLPSHTGRHCQTAVDHCVSASCLNGGTCVNKPGTFFCLCATGFQGLHCEE 801 

Qy 527 LPCQDGTYGLNC AERCDCSHADGCHPT-- 553 

I I I I I | | | | : 



Db  802 KTNPSCADSPCRNKATCQDTPRGARCLCSPGYTGSSCQTLIDLCARKPCPHTARCLQSGP 861

Qy  554 TGHCRCLPGWSGVHCD--SVCAEGRWGPNCSLPCYCKNGASCSPDDG ICECAPGFRG 608
        : : 1 II 1 : : Ml 1 1 1 i 1 1 1 1 : 1
Db  862 SFQCLCLQGWTGALCDFPLSCQKAAMSQGIEISGLCQNGGLCI-DTGSSYFCRCPPGFQG 920

Qy  609 TTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKN 668
        || : I 1 INI: | : | : | : | | | | : 1
Db  921 KLCQDNVNP C--EPNPCHHGS TCVPQPSGYVCQ--CAPGYEGQN 960

Qy  669 CAGI C TCTNNGTC--NPIDRSCQCYPGWIGSDC SQPCPPAHWGPNC 712
        I : : I 1 1 : 1 II 1 1 1 1 1 :: 1 1 -ill:
Db  961 CSKVLDACQSQPCHNHGTCTSRPGGFHCACPPGFVGLRCEGDVDECLDRPCHPS 1014

Qy  713 IHTCNCHNGAFCSAYDGECKCTPGWTGLYC TQ 744
        1 1 1 : : 1 : 1 : 1 1 1 1 1 1 1
Db 1015 -GTAACH--SLANAF--YCQCLPGHTGQRCEVEMDLCQSQPCSNGGSCEITTGPPPGFTC 1069

Qy  745 RCPLGFYGKDC ALIC QCQNGADC DHISGQCTCRTGFMGRHC-EQKCPSG 792
        i 1 1 1 1 1 III 1 1 1 1 1 1 : 1 1 1 1 1
Db 1070 HCPKGFEGPTCSHKALSCGIHHCHNGGLCLPSPKPGSPPLCACLSGFGGPDCLTPPAPP- 1128

Qy  793 TYGYGCRQICDCLNNSTCDHITG TCYCSPGWKGARCDQAG 832
        II ! S : 1 1 1 1 III 1 1 1 : 1
Db 1129 GCGPPSPCLHNGTCTETPGLGNPGFQCTCPPDSPGPRCQRPG 1170





Search completed: March 26, 2004, 16:12:06 
Job time : 36.117 secs
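
The percentages printed with each result can be cross-checked against the raw counts. The sketch below assumes, based only on the numbers in this report and not on GenCore documentation, that Query Match is the score divided by the perfect score, and that Best Local Similarity is the number of matches divided by the number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
import re

PERFECT_SCORE = 6744  # "Perfect score" printed in the report header


def parse_counts(line):
    """Pull the integer counts out of a 'Matches ...; Indels ...;' line."""
    return {key: int(value) for key, value in re.findall(r"(\w+)\s+(\d+);", line)}


def query_match(score, perfect=PERFECT_SCORE):
    """Assumed definition: score as a percentage of the perfect score."""
    return round(100.0 * score / perfect, 1)


def best_local_similarity(counts):
    """Assumed definition: matches as a percentage of aligned columns."""
    columns = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return round(100.0 * counts["Matches"] / columns, 1)


# Counts taken from the notch4 (T09059) hit, score 952.5.
counts = parse_counts(
    "Matches 298; Conservative 68; Mismatches 315; Indels 445; Gaps 65;")
print(query_match(952.5), best_local_similarity(counts))
```

Under these assumptions the computed values agree with the 14.1% and 26.5% printed for that hit.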



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: March 26, 2004, 16:11:16 ; Search time 52.8389 Seconds 

(without alignments) 
5645.353 Million cell updates/sec



Title: US-10-092-390-2

Perfect score: 6744 

Sequence: 1 MVISLNSCLSFICLLLCHWI . SSPKQEDSGGSSSNSSSSSE 1140 



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 1065169 seqs, 261661801 residues 

Total number of hits satisfying chosen parameters: 1065169 
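
The throughput figure in the run header appears to follow from the query length, the database size, and the search time. The sketch below assumes one cell update per query residue per database residue, which is an inference from the printed numbers rather than a documented GenCore definition; the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Assumed definition: (query length x database residues) / time, in millions."""
    return query_len * db_residues / seconds / 1e6


# Values printed in this run: 1140-residue query, 261661801 residues
# searched, 52.8389 seconds.
rate = million_cell_updates_per_sec(1140, 261661801, 52.8389)
print(f"{rate:.1f} million cell updates/sec")
```

The result lands within rounding of the 5645.353 reported above; the small difference is consistent with the search time being printed to four decimal places.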



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100%
Listing first 45 summaries 



Database : Published_Applications_AA: * 

1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
Sequence 29, Appl


37  1024    15.2  2531  15  US-10-369-072-29  Sequence 29, Appl
38  1014.5  15.0  1473  15  US-10-190-115-4   Sequence 4, Appli
39  1014.5  15.0  1473  15  US-10-369-072-4   Sequence 4, Appli
40  1011    15.0   241  14  US-10-084-994-8   Sequence 8, Appli
41  1011    15.0   241  14  US-10-193-109-8   Sequence 8, Appli
42  1011    15.0   241  15  US-10-193-409-8   Sequence 8, Appli
43   998    14.8  2471  15  US-10-190-115-27  Sequence 27, Appl
44   998    14.8  2471  15  US-10-369-072-27  Sequence 27, Appl
45   980    14.5  2469  15  US-10-190-115-2   Sequence 2, Appli
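
A summary entry carries seven fields in the order result number, score, query match, length, DB, ID, description. The sketch below shows one way to read such an entry once it is laid out on a single whitespace-separated line; the `SummaryRow` type and `parse_summary_row` function are hypothetical names, and the single-line layout is an assumption about how the entry is presented.

```python
from typing import NamedTuple


class SummaryRow(NamedTuple):
    result_no: int
    score: float
    query_match: float  # percent of the perfect score
    length: int
    db: int
    seq_id: str
    description: str


def parse_summary_row(line):
    """Split the first six fields on whitespace; the rest is the description."""
    parts = line.split(None, 6)
    return SummaryRow(int(parts[0]), float(parts[1]), float(parts[2]),
                      int(parts[3]), int(parts[4]), parts[5], parts[6].strip())


row = parse_summary_row("37  1024  15.2  2531  15  US-10-369-072-29  Sequence 29, Appl")
print(row.seq_id, row.score, row.query_match)
```

Splitting with a maximum of six cuts keeps the truncated description ("Sequence 29, Appl") intact even though it contains spaces and a comma.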



ALIGNMENTS 



RESULT 1 
US-10-092-390-2 

; Sequence 2, Application US/10092390
; Publication No. US20030013865A1
; GENERAL INFORMATION:
; APPLICANT: Yu, Xuanchuan
; APPLICANT: Miranda, Maricar
; TITLE OF INVENTION: Novel Human EGF-Family Proteins and
; Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0317-USA
; CURRENT APPLICATION NUMBER: US/10/092,390
; CURRENT FILING DATE: 2002-03-06
; PRIOR APPLICATION NUMBER: US 60/275,013
; PRIOR FILING DATE: 2001-03-12
; NUMBER OF SEQ ID NOS: 4
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 2
; LENGTH: 1140
; TYPE: PRT
; ORGANISM: homo sapiens
US-10-092-390-2

Query Match 100.0%; Score 6744; DB 14; Length 1140; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1140; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600

Qy  601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660

Qy  661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720

Qy  721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780

Qy  781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840

Qy  841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900

Qy  901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960

Qy  961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020

Qy 1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080

Qy 1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140



RESULT 2 

US-10-052-648A-33 

Sequence 33, Application US/10052648A 
Publication No. US20040005558A1
GENERAL INFORMATION: 
APPLICANT: Anderson, David 
APPLICANT: Burgess, Catherine 
APPLICANT: Casman, Stacie 
APPLICANT: Colman, Steven 
APPLICANT: Edinger, Shlomit R. 
APPLICANT: Ellerman, Karen 
APPLICANT: Gerlach, Valerie 
APPLICANT: Gunther, Erik 
APPLICANT: Kekuda, Ramesh 
APPLICANT: MacDougall, John R. 
APPLICANT: Mehraban, Fuad 
APPLICANT: Patturajan, Meera 
APPLICANT: Rothenberg, Mark
APPLICANT: Shimkets, Richard
APPLICANT: Smithson, Glennda
APPLICANT: Spytek, Kimberly A. 
APPLICANT: Stone, David J. 
APPLICANT: Vernet, Corine A.M. 
APPLICANT: Zerhusen, Bryan D. 

TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF 
TITLE OF INVENTION: USING THE SAME 
FILE REFERENCE: 21402-250 (CURA-550) 
CURRENT APPLICATION NUMBER: US/10/052,648A
CURRENT FILING DATE: 2002-12-09 
PRIOR APPLICATION NUMBER: 60/262,454 
PRIOR FILING DATE: 2001-01-18 
PRIOR APPLICATION NUMBER: 60/272,920 
PRIOR FILING DATE: 2001-03-02 
PRIOR APPLICATION NUMBER: 60/284,549 
PRIOR FILING DATE: 2001-04-18 
PRIOR APPLICATION NUMBER: 60/303,229 
PRIOR FILING DATE: 2001-07-05 
PRIOR APPLICATION NUMBER: 60/262,892 
PRIOR FILING DATE: 2001-01-19 
PRIOR APPLICATION NUMBER: 60/263,605 
PRIOR FILING DATE: 2001-01-23 
PRIOR APPLICATION NUMBER: 60/269,098 
PRIOR FILING DATE: 2001-02-15 
PRIOR APPLICATION NUMBER: 60/264,159 
PRIOR FILING DATE: 2001-01-25 
PRIOR APPLICATION NUMBER: 60/265,517 
PRIOR FILING DATE: 2001-01-31 
PRIOR APPLICATION NUMBER: 60/271,855 
PRIOR FILING DATE: 2001-02-27 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 97
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 33
LENGTH: 1140
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-052-648A-33 

Query Match 100.0%; Score 6744; DB 15; Length 1140; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1140; Conservative 0; Mismatches 0; Indels 0; Gaps 0 

Qy    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600

Qy  601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660

Qy  661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720

Qy  721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780

Qy  781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840

Qy  841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 SLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900

Qy  901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 WNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960

Qy  961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020

Qy 1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080

Qy 1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 VSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140



RESULT 3 

US-10-052-648A-34 

Sequence 34, Application US/10052648A 
Publication No. US20040005558A1
GENERAL INFORMATION: 
APPLICANT: Anderson, David 
APPLICANT: Burgess, Catherine 
APPLICANT: Casman, Stacie 
APPLICANT: Colman, Steven 
APPLICANT: Edinger, Shlomit R. 
APPLICANT: Ellerman, Karen 
APPLICANT: Gerlach, Valerie 
APPLICANT: Gunther, Erik 
APPLICANT: Kekuda, Ramesh 
APPLICANT: MacDougall, John R. 
APPLICANT: Mehraban, Fuad 
APPLICANT: Patturajan, Meera 
APPLICANT: Rothenberg, Mark 
APPLICANT: Shimkets, Richard 
APPLICANT: Smithson, Glennda 
APPLICANT: Spytek, Kimberly A. 
APPLICANT: Stone, David J. 
APPLICANT: Vernet, Corine A.M. 
APPLICANT: Zerhusen, Bryan D. 

TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF 
TITLE OF INVENTION: USING THE SAME 
FILE REFERENCE: 21402-250 (CURA-550) 
CURRENT APPLICATION NUMBER: US/10/052,648A
CURRENT FILING DATE: 2002-12-09 
PRIOR APPLICATION NUMBER: 60/262,454 
PRIOR FILING DATE: 2001-01-18 
PRIOR APPLICATION NUMBER: 60/272,920 
PRIOR FILING DATE: 2001-03-02 
PRIOR APPLICATION NUMBER: 60/284,549 
PRIOR FILING DATE: 2001-04-18 
PRIOR APPLICATION NUMBER: 60/303,229 
PRIOR FILING DATE: 2001-07-05 
PRIOR APPLICATION NUMBER: 60/262,892 
PRIOR FILING DATE: 2001-01-19 
PRIOR APPLICATION NUMBER: 60/263,605 
PRIOR FILING DATE: 2001-01-23 
PRIOR APPLICATION NUMBER: 60/269,098 
PRIOR FILING DATE: 2001-02-15 
PRIOR APPLICATION NUMBER: 60/264,159 
PRIOR FILING DATE: 2001-01-25 
PRIOR APPLICATION NUMBER: 60/265,517 
PRIOR FILING DATE: 2001-01-31 
PRIOR APPLICATION NUMBER: 60/271,855 
PRIOR FILING DATE: 2001-02-27 

; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 97
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 34
; LENGTH: 969
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-052-648A-34



Query Match 55.9%; Score 3769; DB 15; Length 969; 

Best Local Similarity 58.6%; Pred. No. 1.2e-241; 

Matches 600; Conservative 126; Mismatches 208; Indels 90; Gaps



Qy  109 CADKCVHGRCIAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACH 168
        | : : I I I I 1 1 : : 1 : 1 I 1 1 1 ! 1 1 1 : 1 1 1 II 1 1 1 1 I M : : 1 1 1 i : 1 1 M 1 1 1 1 1 1 1 1
Db   28 CTEECVHGRCVSPDTCHCEPGWGGPDCSSGCDSDHWGPHCSNRCQCQNGALCNPITGACV 87

Qy  169 CAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPG 228
        M | | | | | I I I I : I 1 1 : 1 1 1 1 1 : : 1 1 : 1 1 II 1 1 1 1 1 M : 1 1 : 1 1 1 1 1
Db   88 CAAGFRGWRCEELCAPGTHGKGCQLPCQCRHGASCDPRAGECLCAPGYTGVYCEELCPPG 147

Qy  229 KHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGT 288
        11 II 1 1 1 1 1 1 1 1 Mill: M 1 M MM I 1 1 M 1 M M 1 1 M 1
Db  148 SHGAHCELRCPCQNGGTCHHITGECACPPGWTGAVCAQPCPPGTFGQNCSQDCPCHHGGQ 207

Qy  289 CDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGE 348
        II 1 1 M 1 1 : 1 1 1 M 1 1 M 1 1 1 : M | : : | 1 M 1 M M 1 1 1 1 1 : 1
Db  208 CDHVTGQCHCTAGYMGDRCQEECPFGSFGFQCSQRCDCHNGGQCSPTTGACECEPGYKGP 267

Qy  349 RCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEA 408
        M : 1 1 M 1 1 1 M 1 1 I 1 M 1 1 1 1 1 : M 1 1 M 1 1 1 1 M 1 1 M 1 M 1 :
Db  268 RCQERLCPEGLHGPGCTLPCPCDADNTISCHPVTGACTCQPGWSGHHCNESCPVGYYGDG 327

Qy  409 CQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVD 468
        M 1 M 1 1 1 1 II 1 M 1 1 1 M : 1 1 1 1 1 1 II 1 1 1 1 IMM
Db  328 CQLPCTCQNGADCHSITGGCTCAPGFMGEVCAVSCAAGTYGPNCSSICSCNNGGTCSPVD 387

Qy  469 GSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELP 528
        WWW W IMM: M II II 1 II M 1 II II: : M M M III 1 : MM
Db  388 GSCTCKEGWQGLDCTLPCPSGTWGLNCNESCTCANGAACSPIDGSCSCTPGWLGDTCELP 447

Qy  529 CQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCK 588
        I 1 | | :| I 1 1 : 1 II II 1 II 1 1 1 1 1 M M IMM Ml 1 1 II II II 1 : 1 M
Db  448 CPDGTFGLNCSEHCDCSHADGCDPVTGHCCCLAGWTGIRCDSTCPPGRWGPNCSVSCSCE 507

Qy  589 NGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCL 648
        II II M M 1 II 1 1 1 1 M 1 MM 1 M II 1 1 II II : 1 : 1 M 1
Db  508 NGGSCSPEDGSCECAPGFRGPLCQRICPPGFYGHGCAQPCPLCVHSSRPCHHISGICECL 567

Qy  649 PGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHW 708
        MM : : : II MM II 1 1 1 M II II II M 1 II Ill 1
Db  568 PGFSGALCNQVCAGGYFGQDCAQLCSCANNGTCSPIDGSCQCFPGWIGKDCSQACPPGFW 627

Qy  709 GPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCD 768
        I | I I I M 1 || I II 1 1 1 1 1 II 1 M M : 1 M 1 1 1 MUM : II 1 II M II
Db  628 GPACFHACSCHNGASCSAEDGACHCTPGWTGLFCTQRCPAAFFGKDCGRVCQCQNGASCD 687

Qy 7 69 HISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARC 82 8 

! I I I : I I I I I I I I : I ! I I : I I I : I I I I : I : I : I : I I I I I I I : I I I I I I I I I : I I I I 
Db 68 8 HISGKCTCRTGFTGQHCEQRCAPGTFGYGCQQLCECMNNSTCDHVTGTCYCSPGFKGIRC 7 47 

Qy 829 DQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFII YRHKQKGKES 888 

Ml : : : I I : : I I I I : : : I I : I i : : I : : : I I I I : I : I I I 

Db 74 8 DQA-ALMMEELNPYTKISPALGAERHSVGAVTGIMLLLFFIWLLGLFAWHRRRQKEKGR 8 06 

Qy 889 SM-PAVTYTPAMRWNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDR 947 

: I hlillll: : | | : : | 
Db 8 07 DLAPRVS YT PAMRMT STDYS LS 82 8 

Qy 948 MTVTKSKNNQLFVNLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRS YM 1003 

IIIMI I 

D b 82 9 GACGMDRRQNTYIM 8 42 

Qy 1004 GKSLKDLGKNSEYNSSNCSLSSSENPYATI KDPPVLIPKSSECGYVEMKSPARRDSPYAE 1063 

I II II : I I I I I : I I I I I I I II I I I I : I I I I I I I II I III: 
Db 843 DKGFKDYMKESVCSSSTCSLNSSENPYATIKDPPILTCKLPESSYVEMKSPVHMGSPYTD 902 

Qy 1064 INNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSP 1123 

: : : : : I : I : I I I I I I I I I I I : I I : I I I I : I I I I I I I I I I I I I I 

Db 903 VPSLSTSNKNI YEVEPTVSWQEGCGHNSSYIQNAYDLPRNSHIPGHYDLLPVRQSPANG 962 

Qy 1124 KQED 1127 

: I 

Db 963 PSQD 966 



RESULT 4 

US-10-052-648A-35 

; Sequence 35, Application US/10052648A 
; Publication No. US20040005558A1 
; GENERAL INFORMATION: 



; APPLICANT: Anderson, David
; APPLICANT: Burgess, Catherine
; APPLICANT: Casman, Stacie
; APPLICANT: Colman, Steven
; APPLICANT: Edinger, Shlomit R.
; APPLICANT: Ellerman, Karen
; APPLICANT: Gerlach, Valerie
; APPLICANT: Gunther, Erik
; APPLICANT: Kekuda, Ramesh
; APPLICANT: MacDougall, John R.
; APPLICANT: Mehraban, Fuad
; APPLICANT: Patturajan, Meera
; APPLICANT: Rothenberg, Mark
; APPLICANT: Shimkets, Richard
; APPLICANT: Smithson, Glennda
; APPLICANT: Spytek, Kimberly A.
; APPLICANT: Stone, David J.
; APPLICANT: Vernet, Corine A.M.
; APPLICANT: Zerhusen, Bryan D.
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-250 (CURA-550)
; CURRENT APPLICATION NUMBER: US/10/052,648A
; CURRENT FILING DATE: 2002-12-09
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,920
; PRIOR FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: 60/284,549
; PRIOR FILING DATE: 2001-04-18
; PRIOR APPLICATION NUMBER: 60/303,229
; PRIOR FILING DATE: 2001-07-05
; PRIOR APPLICATION NUMBER: 60/262,892
; PRIOR FILING DATE: 2001-01-19
; PRIOR APPLICATION NUMBER: 60/263,605
; PRIOR FILING DATE: 2001-01-23
; PRIOR APPLICATION NUMBER: 60/269,098
; PRIOR FILING DATE: 2001-02-15
; PRIOR APPLICATION NUMBER: 60/264,159
; PRIOR FILING DATE: 2001-01-25
; PRIOR APPLICATION NUMBER: 60/265,517
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/271,855
; PRIOR FILING DATE: 2001-02-27
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 97
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 35
; LENGTH: 969
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: VARIANT
; LOCATION: (848)..(889)
; OTHER INFORMATION: Where Xaa is any amino acid
US-10-052-648A-35



Query Match 53.4%; Score 3603; DB 15; Length 969; 

Best Local Similarity 56.0%; Pred. No. 1.2e-230; 

Matches 573; Conservative 124; Mismatches 237; Indels 90; Gaps 4; 



Qy  109 CADKCVHGRCIAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACH 168
        I : : | | | M 1 : : 1 : 1 1 1 1 1 1 1 1 1 : 1 1 1 II 1 M 1 1 1 1 : : 1 1 1 1 : 1 1 1 1 1 1 1 1 M 1 1
Db   28 CTEECVHGRCVSPDTCHCEPGWGGPDCSSGCDSDHWGPHCSNRCQCQNGALCNPITGACV 87

Qy  169 CAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPG 228
        11111111111:1 1 1 : 1 1 1 1 1 : : 1 1 : II 1 1 1 1 1 1 II 1 : 1 1 : 1 1 II 1
Db   88 CAAGFRGWRCEELCAPGTHGKGCQLPCQCRHGASCDPRAGECLCAPGYTGVYCEELCPPG 147

Qy  229 KHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGT 288
        II II 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 : 1 1 II 1 II MM 1 1 1 M 1 II M II M 1
Db  148 SHGAHCELRCPCQNGGTCHHITGECACPPGWTGAVCAQPCPPGTFGQNCSQDCPCHHGGQ 207

Qy  289 CDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGE 348
        II 1 II 1 M : II 1 M 1 1 M 1 1 1 : M 1 :: 1 1 II 1 M M 1 1 M M i
Db  208 CDHVTGQCHCTAGYMGDRCQEECPFGSFGFQCSQHCDCHNGGQCSPTTGACECEPGYKGP 267

Qy  349 RCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEA 408
        II: II 1 1 M 1 : 1 1 III Ml 1 1 M :: 1 1 1 M 1 1 1 1 MUM MM
Db  268 RCQERLCPEGLHGPGCTLPCPCDADNTISCHPVTGACTCQPGWSGHHCNESCPVGYYGDG 327



Qy 409 CQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVD 468 

II I : I I I I I I I I : I I I I I I I I I I I : I I I I I I I I I i I I I I I I I 
Db 328 CQLPCTCQNGADCHSITGGCTCAPGFMGEVCAVSCAAGTYGPNCSSICSCNNGGTCSPVD 387 

Qy 469 GSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELP 528

I | | | | | | | | : | | : : I I I I I I I II : I I II II- : I I : I : I III M I I I I 

Db 388 GSCTCKEGWQGLDCTLPCPSGTWGLNCNESCTCANGAACSPIDGSCSCTPGWLGDTCELP 447 

Qy 529 CQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCK 588

I | | | : | | | I : I I I I I I I I I I I I I I I M I I : I : Ml I MINIM: | : 
Db 448 CPDGTFGLNCSEHCDCSHADGCDPVTGHCCCLAGWTGIRCDSTCPPGRWGPNCSVSCSCE 507 

Qy 589 NGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCL 648

I I II I I : I I II I i I IMII I II I I I : I : I : I I 

Db 508 NGGSCSPEDGSCECAPGFRGPLCQRICPPGFYGHGCAQPCPLCVHSSRPCHHISGICECL 567 

Qy 649 PGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHW 708

| | | : | | | | | : | | | | | :: | | : I : I I I I I I : I I I I I I I : I I I I I M I I I M I 
Db 568 PGFSGALCNQVCAGGYFGQDCAQLCSCANNGTCSPIDGSCQCFPGWIGKDCSQACPPGFW 627 

Qy 709 GPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCD 768 

I I I I I : I I I II I I I I I I I I I I I M I : M I I I I MINI MINIUM 
Db 628 GPACFHACSCHNGASCSAEDGACHCTPGWTGLFCTQRCPAAFFGKDCGRVCQCQNGASCD 687 

Qy 769 HISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARC 828 

| | | | : | | | || I I I : I I I I M II M I I M I : I M M II I I I I M I I I I I I I I M I I I 

Db 688 HISGKCTCRTGFTGQHCEQRCAPGTFGYGCQQLCECMNNSTCDHVTGTCYCSPGFKGIRC 747 

Qy 829 DQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKES 888 

I | | : : : II : : I I I M : M M I M M : : : : I I N M Ml I 

Db 748 DQA-ALMMEELNPYTKISPALGAERHSVGAVTGIMLLLFLIWLLGLFAWHRRRQKEKGR 806

Qy 889 SM-PAVTYTPAMRWNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDR 947 

: I I M I I I I I : : I M M 

Db 807 DLAPRVSYTPAMRMTSTDYSLS 828

Qy 948 MTVTKSKNNQLFVNLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRS YM 1003 

II MM I 

Db 829 GACGMDRRQNTYIM 842

Qy 1004 GKSLKDLGKNSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAE 1063 

|| I I I I I I I : 

Db 843 DKGFKXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXMKSPVHMGSPYTD 902 

Qy 1064 INNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSP 1123 

: : : : : I : I M I M I I I I II I M I : I I I I M I I I I I II I : : 

Db 903 VPSLSTSNKNIYEVEPTVSWQEGCGHNSSYIQNAYDLPRNSHIPGHYDLLPVRQSPANG 962 

Qy 1124 KQED 1127 

: I 

Db 963 PSQD 966 



RESULT 5 
US-10-092-390-4 

; Sequence 4, Application US/10092390
; Publication No. US20030013865A1
; GENERAL INFORMATION:
; APPLICANT: Yu, Xuanchuan
; APPLICANT: Miranda, Maricar
; TITLE OF INVENTION: Novel Human EGF-Family Proteins and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0317-USA
; CURRENT APPLICATION NUMBER: US/10/092,390
; CURRENT FILING DATE: 2002-03-06
; PRIOR APPLICATION NUMBER: US 60/275,013
; PRIOR FILING DATE: 2001-03-12
; NUMBER OF SEQ ID NOS: 4
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 4
; LENGTH: 586
; TYPE: PRT
; ORGANISM: homo sapiens
US-10-092-390-4 

Query Match 53.4%; Score 3601; DB 14; Length 586; 

Best Local Similarity 100.0%; Pred. No. 9.3e-231; 

Matches 586; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCY 586
        ||||||||||||||||||||||||||||||||||||||||||||||
Db  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCY 586



RESULT 6 

US-10-365-227-20 

; Sequence 20, Application US/10365227
; Publication No. US20030143632A1
; GENERAL INFORMATION:
; APPLICANT: McCarthy, Sean A.
; APPLICANT: Holtzman, Douglas A.
; APPLICANT: Goodearl, Andrew D.J.
; TITLE OF INVENTION: NOVEL GENES ENCODING PROTEINS HAVING
; TITLE OF INVENTION: PROGNOSTIC, DIAGNOSTIC, PREVENTIVE, THERAPEUTIC AND OTHER
; TITLE OF INVENTION: USES
; FILE REFERENCE: 07334-323001
; CURRENT APPLICATION NUMBER: US/10/365,227
; CURRENT FILING DATE: 2003-02-12
; PRIOR APPLICATION NUMBER: US/09/802,582
; PRIOR FILING DATE: 2001-03-08
; PRIOR APPLICATION NUMBER: US 09/128,709
; PRIOR FILING DATE: 1998-08-04
; PRIOR APPLICATION NUMBER: US 60/054,645
; PRIOR FILING DATE: 1997-08-04
; PRIOR APPLICATION NUMBER: US 09/130,491
; PRIOR FILING DATE: 1998-08-06
; PRIOR APPLICATION NUMBER: US 60/054,966
; PRIOR FILING DATE: 1997-08-06
; PRIOR APPLICATION NUMBER: US 60/058,108
; PRIOR FILING DATE: 1997-09-05
; PRIOR APPLICATION NUMBER: US 09/388,280
; PRIOR FILING DATE: 1999-09-01
; PRIOR APPLICATION NUMBER: US 09/388,279
; PRIOR FILING DATE: 1999-09-01
; NUMBER OF SEQ ID NOS: 20
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 20
; LENGTH: 601
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-365-227-20

Query Match 50.4%; Score 3399; DB 14; Length 601;

Best Local Similarity 99.7%; Pred. No. 2.4e-217;

Matches 571; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy  406 GEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCS 465
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCS 60

Qy  466 PVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKC 525
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 PVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKC 120

Qy  526 ELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPC 585
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 ELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPC 180

Qy  586 YCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLC 645
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 YCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLC 240

Qy  646 DCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPP 705
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 DCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPP 300

Qy  706 AHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGA 765
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 AHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGA 360

Qy  766 DCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKG 825
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 DCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKG 420

Qy  826
        I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I M I
Db  421

Qy  886
        M I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I
Db  481 KESSMPAVTYTPAMRWNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNR 540

Qy  946 DRMTVTKSKNNQLFVNLKNVNPGKRGPVGDCTG 978
        ||||||||||||||||||||||||||||||| |
Db  541 DRMTVTKSKNNQLFVNLKNVNPGKRGPVGDCMG 573



RESULT 7 

US-10-052-648A-10 

; Sequence 10, Application US/10052648A
; Publication No. US20040005558A1
; GENERAL INFORMATION:
; APPLICANT: Anderson, David
; APPLICANT: Burgess, Catherine
; APPLICANT: Casman, Stacie
; APPLICANT: Colman, Steven
; APPLICANT: Edinger, Shlomit R.
; APPLICANT: Ellerman, Karen
; APPLICANT: Gerlach, Valerie
; APPLICANT: Gunther, Erik
; APPLICANT: Kekuda, Ramesh
; APPLICANT: MacDougall, John R.
; APPLICANT: Mehraban, Fuad
; APPLICANT: Patturajan, Meera
; APPLICANT: Rothenberg, Mark
; APPLICANT: Shimkets, Richard
; APPLICANT: Smithson, Glennda
; APPLICANT: Spytek, Kimberly A.
; APPLICANT: Stone, David J.
; APPLICANT: Vernet, Corine A.M.
; APPLICANT: Zerhusen, Bryan D.
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-250 (CURA-550)
; CURRENT APPLICATION NUMBER: US/10/052,648A
; CURRENT FILING DATE: 2002-12-09
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,920
; PRIOR FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: 60/284,549
; PRIOR FILING DATE: 2001-04-18
; PRIOR APPLICATION NUMBER: 60/303,229
; PRIOR FILING DATE: 2001-07-05
; PRIOR APPLICATION NUMBER: 60/262,892
; PRIOR FILING DATE: 2001-01-19
; PRIOR APPLICATION NUMBER: 60/263,605
; PRIOR FILING DATE: 2001-01-23
; PRIOR APPLICATION NUMBER: 60/269,098
; PRIOR FILING DATE: 2001-02-15
; PRIOR APPLICATION NUMBER: 60/264,159
; PRIOR FILING DATE: 2001-01-25
; PRIOR APPLICATION NUMBER: 60/265,517
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/271,855
; PRIOR FILING DATE: 2001-02-27
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 97
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 10
; LENGTH: 1037
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-052-648A-10 

Query Match 39.8%; Score 2687; DB 15; Length 1037; 

Best Local Similarity 43.6%; Pred. No. 8e-170; 

Matches 506; Conservative 115; Mismatches 361; Indels 178; Gaps 27;

Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW---FKCT 70
i i i . i ii iii ii Mi** i 'ii' ii • i i i
Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130

Db 67

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190
i • i i i i iii i i i i • i • i • i i i • i i i iii
Db 127 RGDDCSSECAPGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPA 186

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250
I 1111:111! I I I I I II I : I I 1 I I 1 I I I I I
Db 187 CQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGTSGFFCPSTHPCQNGGVFQTPQ 245

Qy 251 GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 310
I I I I I I I I II : I I I I I I I I I I I I I : I I I I I II I I I I I : I I I I I : I I : : I



Db 246 GSCSCPPGWMGTICSLPCPEGFHGPNCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREE 305 

Qy 311 CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 370 

1111:1 I I I I I I : I : : I M I M I I I : I I I I I I : I I I : I i I 

Db 306 CPVGRFGQDCAETCDCAPDARCFPANGACLCEHGFTGDRCTDRLCPDGFYGLSCQAPCTC 365

Qy 371 HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 430 

I : : I I I I I : I I I : I I I I : I I : I I I : I : I I I : I I : I I : : I I I 

Db 366 DREHSLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQEYCLCLHGGVCQATSGLCQC 425 

Qy 431 APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 490 

I I I : I I : : I 1 I II : I M : I I I : I I I I : I I I II II : M : I I I I 

Db 426 APGYTGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGT 485 

Qy 491 WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 550

I I i I I : I I I : I : I I I I I I I I hill I : I I I I I I I I : M I 
Db 486 WGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGC 545 

Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 610 

I I I : I III I I I I I I I I I I I I I I : I I :: I I I I I I I I I : 
Db 546 DPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRGPS 605 

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670

I II I I I I I I I : I 

Db 606 CQRSCQPGRYGKR CVP 621 

Qy 671 GICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGE 730 

III: I : I : : I I I I I I I I I I I I I I I I I I I I I I : I ! II 

Db 622 --CKCANHSFCHPSNGTCYCLAGWTGPDCSQPCPPGHWGENCAQTCQCHHGGTCHPQDGS 679

Qy 731 CKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCP 790 

I I I I I I : I : I I I I : I : I : I I I I I II 

Db 680 CICPLGWTGHHCLEGCPLGTFGANCSQPCQCGPGEKC HPE 719 

Qy 791 SGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALP 850 

I I I I I I I I I : I : I : I 

Db 720 TGACVCPPGHSGAPCR IG IQEPFTVMP 746 

Qy 851 AD--SY-QIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMRWNADYT 907

: I : i I : I I : I : I : I : II I I I I I I I I I I III: I : : : I 

Db 747 TTPVAYNSLGAVIGIAVLGSLWALVALFIGYRHWQKGKEHHHLAVAYSSG-RLDGSEYV 805 

Qy 908 ISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN-VN 966

I : I I I : : I I I I I II : I ! : : I I : I I : I : I 

Db 806 MPDVPP SYSHYYSNPSYHTLSQCSPNPPPPNK VPGPLFASLQNPER 851 

Qy 967 PGKRGPVG-DCTGTLPADWKH GGYLNELGAFGLDRS YMGKSL KDLGK 1012

I I I I I I I I I I I I I I I : I : I I I I I II 

Db 852 PG--GAQGHDNHTTLPADWKHRREPPPGPLDR-GSSHLDRSYSYSYSNGPGPFYDKGLIS 908 

Qy 1013 NSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSP ARR 1057 

I : I I I I I I I I I I I I : I I I Ihllll II 

Db 909 EEELGASVTSL-SSENPYATIRDLPSLPGGPRESSYMEMKGPPSGSPPRQPPQFWDSQRR 967 

Qy 1058 DSPYAEINNSTSANRNWEVEPTVSWQGVFSNNGRLSQDP YDLPKNSHIP 1108 

I : : : I I I : I : : : III i I I I I I I I I 

Db 968 RQPQPQRDSGT YE-QPSPL IHDRDSVGSQPPLPPGLPPGHYDSPKNSHIP 1016 



i 



Qy 1109 CHYDLLPVRDSSSSP-KQED 1127 

I I I I I I I I I : : : I 
Db 1017 GHYDLPPVRHPPSPPLRRQD 1036 

RESULT 8 

US-10-052-648A-31 

; Sequence 31, Application US/10052648A
; Publication No. US20040005558A1
; GENERAL INFORMATION:
; APPLICANT: Anderson, David
; APPLICANT: Burgess, Catherine
; APPLICANT: Casman, Stacie
; APPLICANT: Colman, Steven
; APPLICANT: Edinger, Shlomit R.
; APPLICANT: Ellerman, Karen
; APPLICANT: Gerlach, Valerie
; APPLICANT: Gunther, Erik
; APPLICANT: Kekuda, Ramesh
; APPLICANT: MacDougall, John R.
; APPLICANT: Mehraban, Fuad
; APPLICANT: Patturajan, Meera
; APPLICANT: Rothenberg, Mark
; APPLICANT: Shimkets, Richard
; APPLICANT: Smithson, Glennda
; APPLICANT: Spytek, Kimberly A.
; APPLICANT: Stone, David J.
; APPLICANT: Vernet, Corine A.M.
; APPLICANT: Zerhusen, Bryan D.
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-250 (CURA-550)
; CURRENT APPLICATION NUMBER: US/10/052,648A
; CURRENT FILING DATE: 2002-12-09
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,920
; PRIOR FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: 60/284,549
; PRIOR FILING DATE: 2001-04-18
; PRIOR APPLICATION NUMBER: 60/303,229
; PRIOR FILING DATE: 2001-07-05
; PRIOR APPLICATION NUMBER: 60/262,892
; PRIOR FILING DATE: 2001-01-19
; PRIOR APPLICATION NUMBER: 60/263,605
; PRIOR FILING DATE: 2001-01-23
; PRIOR APPLICATION NUMBER: 60/269,098
; PRIOR FILING DATE: 2001-02-15
; PRIOR APPLICATION NUMBER: 60/264,159
; PRIOR FILING DATE: 2001-01-25
; PRIOR APPLICATION NUMBER: 60/265,517
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/271,855
; PRIOR FILING DATE: 2001-02-27
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 97
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 31
; LENGTH: 1034
; TYPE: PRT
; ORGANISM: Mus musculus
US-10-052-648A-31 

Query Match 39.6%; Score 2668; DB 15; Length 1034; 

Best Local Similarity 42.7%; Pred. No. 1.5e-168; 

Matches 493; Conservative 110; Mismatches 379; Indels 172; Gaps 16; 

Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW FKCT 70 

III : I I I I I M : i I i : : I : I I : 1 I : II I I 

Db 7 LLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRPFSLLPAESCH — RPWEDPHTCA 64 

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130

: 1 I I! I I I I : I I I I : I I I I I I I I : I II I I I : I I I I I I I I I 
Db 65 QPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPLCAQECVHGRCVAPNQCQCAPGW 124 

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190 

I : I I I I I I I I | | | : | : | : | | | | : | : | I I I I 

Db 125 RGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGACFCPSGLQPPNCLQPCPAGHYGPA 184 

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250 

I III I I : I I I I I I I I I I I I I I : I I I I I I I 
Db 185 CQFDCQCY-GASCDPQDGACFCPPGRAGPSCNVPCSQGTDGFFCPRTYPCQNGGVPQGSQ 243 

Qy 251 GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 310 

I I I I I I I I I : 1 I I 1 I I I I I : I I I : I I I I I II I I I I I I : I I I I : I I I : I 

Db 244 GSCSCPPGWMGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCQEE 303 

Qy 311 CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 370 

I I I I : I I I I I I I I : I : : I I I I I I I I I : II I I I I : I I I : I : I I 
Db 304 CPVGRFGQDCAETCDCAPGARCFPANGACLCEHGFTGDRCTERLCPDGRYGLSCQEPCTC 363 

Qy 371 HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 430 

I : : I I I I! I I I : I : I I I : I i : I I I : I : I I I : I I : I I : : I I I 

Db 364 DPEHSLSCHPMHGECSCQPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGLCLADSGLCRC 423 

Qy 431 APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 490

I II : I I : II I I I I I I I I II I : I I I I : I I : I I I I I : I I : I I I I 
Db 424 APGYTGPHCANLCPPDTYGINCSSRCSCENAIACSPIDGTCICKEGWQRGNCSVPCPLGT 483 

Qy 491 WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 550

I I I I I : I I I : I I : I I I I I I I I hill I : I II I I I I : I I I 

Db 484 WGFNCNASCQCAHDGVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGC 543 

Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 610 

I I I I I III I I I I I I 1 II I I I II : I :: I I I I I I I I I : 
Db 544 DPVHGQCRCQAGWMGTRCHLPCPEGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPS 603 

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670 

I I I I I I I I I I I 

Db 604 CQRPCPPGRYGKRCVQ 619 



Qy 671 GICTCTNN-GTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDG 729
I I I I : I : I I : I I I I I I I I : I I I I I I ! I I I : I I II
Db 620 --CKCNNNHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDG 677

Qy 730 ECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKC 789
1 1 1 I 1 1 1 1 1 : 1 1 : I : I : : i 1 i 1
Db 678 SCICTPGWTGPNCLEGCPPRMFGVNCSQLCQCDLG 712

Qy 790 PSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTAL 849
1 1 1 i I 1 1 1 i 1 : 1 : 1 : 1 =
Db 713 EMCHPQTGACVCPPGHSGADCK MGSQESFTIMPTS- 747

Qy 850 PADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMRWNADYTIS 909
| : 1 1 : 1 1 : 1 : 1 : 1 : 1 1 1 1 II Mill III: 1 : : 1 1 :
Db 748 PVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTG-RLDGSDYVMP 806

Qy 910 GTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGK 969
| : I I 1 : : 1 1 1 1 1 1 1 : 1 1 : : 1 1 1 =1111::
Db 807 DVSP SYSHYYSNPSYHTLSQCSPNPPPPN KVPGSQLFVSSQAPERPS 853

Qy 970 RGPVGDCTGTLPADWKHGGYLNELGAFGLDRSY MGKS 1006
1 : 1 1 1 II 1 1 1 : 1 1 1 1 1 1 1 1 'II
Db 854 RAHGRENHVTLPADWKHRREPHERGASHLDRSYSCSYSHRNGPGPFCHKGPISEEGLGAS 913

Qy 1007 LKDLGKNSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINN 1066
: | 111111111:111 : 1 1 1 1 1 II 1 1 : :
Db 914 VMSL SSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLH- 958

Qy 1067 STSANRNVYEVEP TVSWQGVFSNNGRLSQDP YDLPKNSHIPCHY 1111
: | : : : | I : 1 1 1 II 1 1 1 1 1 II 1 1
Db 959 --LRDRQQRQLQPQRDSGTYEQPSPLSHNEESLGSTPPLPPGLPPGQYDSPKNSHIPGHY 1016

Qy 1112 DLLPVRDSSSSPKQ 1125
Mill II:
Db 1017 DLPPVRHPPSPPSR 1030





RESULT 9 

US-10-052-648A-32 

; Sequence 32, Application US/10052648A
; Publication No. US20040005558A1
; GENERAL INFORMATION:
; APPLICANT: Anderson, David
; APPLICANT: Burgess, Catherine
; APPLICANT: Casman, Stacie
; APPLICANT: Colman, Steven
; APPLICANT: Edinger, Shlomit R.
; APPLICANT: Ellerman, Karen
; APPLICANT: Gerlach, Valerie
; APPLICANT: Gunther, Erik
; APPLICANT: Kekuda, Ramesh
; APPLICANT: MacDougall, John R.
; APPLICANT: Mehraban, Fuad
; APPLICANT: Patturajan, Meera
; APPLICANT: Rothenberg, Mark
; APPLICANT: Shimkets, Richard
; APPLICANT: Smithson, Glennda
; APPLICANT: Spytek, Kimberly A.
; APPLICANT: Stone, David J.
; APPLICANT: Vernet, Corine A.M.
; APPLICANT: Zerhusen, Bryan D.
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-250 (CURA-550)
; CURRENT APPLICATION NUMBER: US/10/052,648A
; CURRENT FILING DATE: 2002-12-09
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,920
; PRIOR FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: 60/284,549
; PRIOR FILING DATE: 2001-04-18
; PRIOR APPLICATION NUMBER: 60/303,229
; PRIOR FILING DATE: 2001-07-05
; PRIOR APPLICATION NUMBER: 60/262,892
; PRIOR FILING DATE: 2001-01-19
; PRIOR APPLICATION NUMBER: 60/263,605
; PRIOR FILING DATE: 2001-01-23
; PRIOR APPLICATION NUMBER: 60/269,098
; PRIOR FILING DATE: 2001-02-15
; PRIOR APPLICATION NUMBER: 60/264,159
; PRIOR FILING DATE: 2001-01-25
; PRIOR APPLICATION NUMBER: 60/265,517
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/271,855
; PRIOR FILING DATE: 2001-02-27
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 97
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 32
; LENGTH: 1034
; TYPE: PRT
; ORGANISM: Mus musculus
US-10-052-648A-32 



Query Match 39.5%; Score 2667; DB 15; Length 1034; 

Best Local Similarity 42.8%; Pred. No. 1.7e-168;

Matches 494; Conservative 110; Mismatches 378; Indels 172; Gaps 17;



Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW---FKCT 70

Ml : I I I I I I I : I I I : : I : I I = I I : I I I I 

Db 7 LLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRPFSLLPAESCH — RPWEDPHTCA 64 



Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130 

: I I I I II I I : I I I I : I II I II I I : I I I I I I : I I I i I I I I I 

Db 65 QPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPLCAQECVHGRCVAPNQCQCAPGW 124 

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190 

I : I I I I I I I I I I I : I : I : I I I : I : I I I I I 

Db 125 RGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGTCFCPSGLQPPNCLQPCPAGHYGPA 184 

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250 

I III I I : I ! I I I I I I I I I I I I : I I I I I I I 

Db 185 CQFDCQCY-GASCDPQDGACFCPPGRAGPSCNVPCSQGTDGFFCPRTYPCQNGGVPQGSQ 243 



Qy 251 GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 310



I I I I I I I I I : I I I I I I I I I : I I I : I I I I I I I I : I I I I : I I I : I 

Db 244 GSCSCPPGWMGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCQEE 303 

Qy 311 CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 370

I i I I : I I I I I I I I : I : : I I I I I I I I I : I I I I I I : I I I : I : I I 

Db 304 CPVGRFGQDCAETCDCAPGARCFPANGACLCEHGFTGDRCTERLCPDGRYGLSCQEPCTC 363

Qy 371 HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 430 

I : : I I I I I I I I : I : I I I : I I : I I I : I : I I I : I I : I I : : I I I 

Db 364 DPEHSLSCHPMHGECSCQPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGLCLADSGLCRC 423 

Qy 431 APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 490

I I I : I I : II I I I I I I I I I I I : I I I I : I I : I I I I I : I I : I I I I 

Db 424 APGYTGPHCANLCPPDTYGINCSSRCSCENAIACSPIDGTCICKEGWQRGNCSVPCPLGT 483 

Qy 491 WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 550

I I I I 1 : I I I : I I : I I I I I I I I hill I : I II I I I h I I I 
Db 484 WGFNCNASCQCAHDGVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGC 543

Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 610 

I Ml! Ill I I I I I I I ! I I I I I I : I :: I I I I I I I I I : 
Db 544 DPVHGQCRCQAGWMGTRCHLPCPEGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPS 603

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670

I I I I I I II I I I 

Db 604 CQRPCPPGRYGKRCVQ 619 

Qy 671 GICTCTNN-GTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDG 729

I I I I : | : | I : I I II I I i I r III III I I I I : I I I I 
Db 620 --CKCNNNHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDG 677

Qy 730 ECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKC 789 

I I I I I I I I I : I I : | : | : : I I I I I II 
Db 678 SCICTPGWTGPNCLEGCPPRMFGVNCSQLCQCDLGEMC HPE 718 

Qy 790 PSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTAL 849

I I I I I I ill : h I : h 

Db 719 T G AC VC P P GH S GAD C K MGSQESFTIMPTS- 747 

Qy 850 PADSYQIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSMPAVTYTPAMRWNADYTIS 909 

I : | | : | | : | : I : I : II I I I I I I I I I M h h : I I : 

Db 748 PVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTG-RLDGSDYVMP 806

Qy 910 GTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGK 969 

I : | | t : : I I I I I I I : I I : : I I I : II I h : 

Db 807 DVSP SYSHYYSNPSYHTLSQCSPNPPPPN KVPGSQLFVSSQAPERPS 853 

Qy 970 RGPVGDCTGTLPADWKHGGYLNELGAFGLDRSY MGKS 1006

I : I I I I I I I I : I I I I I I I I : I I 

Db 854 RAHGRENHVTLPADWKHRREPHERGASHLDRSYSCSYSHRNGPGPFCHKGPISEEGLGAS 913 

Qy 1007 LKDLGKNSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINN 1066 

: I I I I I I I I I I : I I I : I I I I I I I I I : : 

Db 914 VMSL SSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLH- 958 

Qy 1067 STSANRNVYEVEP TVSWQGVFSNNGRLSQDP YDLPKNSHI PCHY 1111 

: I : : : I I : I I I II I I I II I I II 



Db 959 --LRDRQQRQLQPQRDSGTYEQPSPLSHNEESLGSTPPLPPGLPPGHYDSPKNSHIPGHY 1016



Qy 1112 DLLPVRDSSSSPKQ 1125 

I I I I I II: 
Db 1017 DLPPVRHPPSPPSR 1030 



RESULT 10 
US-10-052-648A-8 

; Sequence 8, Application US/10052648A
; Publication No. US20040005558A1
; GENERAL INFORMATION:
; APPLICANT: Anderson, David
; APPLICANT: Burgess, Catherine
; APPLICANT: Casman, Stacie
; APPLICANT: Colman, Steven
; APPLICANT: Edinger, Shlomit R.
; APPLICANT: Ellerman, Karen
; APPLICANT: Gerlach, Valerie
; APPLICANT: Gunther, Erik
; APPLICANT: Kekuda, Ramesh
; APPLICANT: MacDougall, John R.
; APPLICANT: Mehraban, Fuad
; APPLICANT: Patturajan, Meera
; APPLICANT: Rothenberg, Mark
; APPLICANT: Shimkets, Richard
; APPLICANT: Smithson, Glennda
; APPLICANT: Spytek, Kimberly A.
; APPLICANT: Stone, David J.
; APPLICANT: Vernet, Corine A.M.
; APPLICANT: Zerhusen, Bryan D.
; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF
; TITLE OF INVENTION: USING THE SAME
; FILE REFERENCE: 21402-250 (CURA-550)
; CURRENT APPLICATION NUMBER: US/10/052,648A
; CURRENT FILING DATE: 2002-12-09
; PRIOR APPLICATION NUMBER: 60/262,454
; PRIOR FILING DATE: 2001-01-18
; PRIOR APPLICATION NUMBER: 60/272,920
; PRIOR FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: 60/284,549
; PRIOR FILING DATE: 2001-04-18
; PRIOR APPLICATION NUMBER: 60/303,229
; PRIOR FILING DATE: 2001-07-05
; PRIOR APPLICATION NUMBER: 60/262,892
; PRIOR FILING DATE: 2001-01-19
; PRIOR APPLICATION NUMBER: 60/263,605
; PRIOR FILING DATE: 2001-01-23
; PRIOR APPLICATION NUMBER: 60/269,098
; PRIOR FILING DATE: 2001-02-15
; PRIOR APPLICATION NUMBER: 60/264,159
; PRIOR FILING DATE: 2001-01-25
; PRIOR APPLICATION NUMBER: 60/265,517
; PRIOR FILING DATE: 2001-01-31
; PRIOR APPLICATION NUMBER: 60/271,855
; PRIOR FILING DATE: 2001-02-27
; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 97
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 8
; LENGTH: 1037
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-052-648A-8 

Query Match 39.4%; Score 2656; DB 15; Length 1037; 

Best Local Similarity 43.4%; Pred. No. 9.1e-168; 

Matches 503; Conservative 115; Mismatches 364; Indels 178; Gaps 27; 



Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW---FKCT 70
III : 1 II 1 1 1 1 1 1 1 1 : : 1 : 1 1 : II : 1 1 1
Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130
: 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 : 1 1 i 1 1 1 1 1 1
Db 67 QPTWYRTVYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGW 126

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190
1 : 1 II 1 1 1 1 1 1 1 1 : 1 : 1 : 1 1 1 : 1 : 1 1 II!
Db 127 RGDDCSSECAPGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPA 186

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250
1 1111:1111 1 1 1 1 1 II 1 : 1 1 1 1 1 1 1 1 1 1 1
Db 187 CQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGTSGFFCPSTHPCQNGGVFQTPQ 245

Qy 251 GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 310
1 1 1 1 1 1 1 1 1 1 : 1 Mill 1 111111:11111 II 1 1 1 1 I : I I I I I : 1 I : : 1
Db 246 GSCSCPPGWMGTICSLPCPEGFHGPNCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREE 305

Qy 311 CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 370
1 1 1 1 : 1 1 1 1 1 1 1 : 1 : =111111 I i 1 = 11 1 1 1 1 : 1 II : 1 1
Db 306 CPVGRFGQDCAETCDCAPDARCFPANGACLCEHGFTGDRCTDRLCPDGFYGLSCQAPRTC 365

Qy 371 HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 430
1 : : II 1 1 1 : 1 1 1 : 1 1 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 : 1 1 : 1 1 : : 1 1 1
Db 366 DREHSLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGVCQATSGLCQC 425

Qy 431 APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 490
111:1 1 : : 1 1 1 1 1 : 1 1 1 : 1 1 1 : 1 1 1 1 : 1 1 1 I 1 1 1 : 1 1 : 1 1 1 1
Db 426 APGYTGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGT 485

Qy 491 WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 550
11111:111: 1 : 1 1 1 1 1 1 1 1 1 : II 1 1 : 1 1 1 1 1 1 1 1 : 1 1 1
Db 486 WGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGC 545

Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 610
1 1 1 : 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 :: 1 1 1 1 1 1 1 1 1 :
Db 546 DPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRGPS 605

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670
1 1 1 1 1 1 1 1 1 hi
Db 606 CQRSCQPGRYGKR CVP 621

Qy 671 GICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGE 730
I I I . I . I . ■ I I II I I I 1 I I I I I III II II I I • I I II
Db 622 --CKCANHSFCHPSNGTCYCLAGWTGPDCSQPCPPGHWGENCAQTCQCHHGGTCHPQDGS 679

Qy 731 CKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCP 790

I I I I I I : I : I I I I : I : I : I M I I I I 
Db 680 CICPLGWTGHHCLEGCPLGTFGANCSQPCQCGPGEKC HPE 719 

Qy 791 SGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALP 850 

I I I I I I I I I : I : I : I 

Db 720 TGACVCPPGHSGAPCR IG IQEPFTVMP 746 

Qy 851 AD--SY-QIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYT 907

: I : | | : I I : | : I : I : I I I I I I I I I I I III: I : : : I

Db 747 TTPVAYNSLGAVIGIAVLGSLVVALVALFIGYRHWQKDKEHHHLAVAYSSG-RLDGSEYV 805

Qy 908 ISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN-VN 966 

: I : I I I : : I I I I I I I : I I : : I I : I I : I : I 

Db 806 MPDVPP S YSHYYSNPSYHTLSQCSPNPPPPNK VPGPLFASLQNPER 851 

Qy 967 PGKRGPVG-DCTGTLPADWKH GGYLNELGAFGLDRSYMGKSL KDLGK 1012 

I I I I I I I I I I I II I I : I : I I I I I II 

Db 852 PG--GAQGHDNHTTLPADWKHRREPPPGPLDR-GS SRLDRS YSYS YSNGPGPFYNKGLI S 908 

Qy 1013 NSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSP ARR 1057 

I : I 11111111111:111 I I : I I I I II 

Db 909 EEELWASVASL-SSENPYATIRDLPSLPGGPRESSYMEMKGPPSGSPPRQPPQFWDSQRR 967 

Qy 1058 DSPYAEINNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDP YDLPKNSHIP 1108 

I : : : I I I : I : : : III I I I I I I I I I 

Db 968 RQPQPQRDSGT YE-QPSPL IHDRDSVGSQPPLPPGLPPGHYDS PKNSHI P 1016 

Qy 1109 CHYDLLPVRDSSSSP-KQED 1127 

I I I I II I I I : : : I 
Db 1017 GHYDLPPVRHPPSPPLRRQD 1036 



RESULT 11 
US-09-796-753-114 

; Sequence 114, Application US/09796753 

; Publication No. US20030027998A1 

; GENERAL INFORMATION: 

; APPLICANT: McCarthy, Sean A. 

TITLE OF INVENTION: SECRETED PROTEINS AND USES THEREOF 

FILE REFERENCE: 7853-227-999 
; CURRENT APPLICATION NUMBER: US/09/796,753
; CURRENT FILING DATE: 2001-03-01 
; PRIOR APPLICATION NUMBER: 09/183,175 
; PRIOR FILING DATE: 1998-10-30 
; PRIOR APPLICATION NUMBER: 09/223,094 

PRIOR FILING DATE: 1998-12-30 
; PRIOR APPLICATION NUMBER: 09/223,546 
; PRIOR FILING DATE: 1998-12-30 

PRIOR APPLICATION NUMBER: 09/224,246 

PRIOR FILING DATE: 1998-12-30 

PRIOR APPLICATION NUMBER: 09/259,388 

PRIOR FILING DATE: 1999-02-26 
; PRIOR APPLICATION NUMBER: 60/122,458 



; PRIOR FILING DATE: 1999-03-01

PRIOR APPLICATION NUMBER: 09/312,359 

; PRIOR FILING DATE: 1999-05-14 

; PRIOR APPLICATION NUMBER: 09/336,536 

; PRIOR FILING DATE: 1999-06-18 

; PRIOR APPLICATION NUMBER: 09/342,687 

; PRIOR FILING DATE: 1999-06-29 

; PRIOR APPLICATION NUMBER: 09/345,464 

PRIOR FILING DATE: 1999-06-30 

; PRIOR APPLICATION NUMBER: 09/365,164 

; PRIOR FILING DATE: 1999-07-30 

; PRIOR APPLICATION NUMBER: 09/399,723 

; PRIOR FILING DATE: 1999-09-20 

; PRIOR APPLICATION NUMBER: 09/409,634 

; PRIOR FILING DATE: 1999-09-30 

; PRIOR APPLICATION NUMBER: 09/471,179 

; PRIOR FILING DATE: 1999-12-23 

; PRIOR APPLICATION NUMBER: 09/474,071 

; PRIOR FILING DATE: 1999-12-29 

; PRIOR APPLICATION NUMBER: 09/474,072 

; PRIOR FILING DATE: 1999-12-29 

; PRIOR APPLICATION NUMBER: 09/514,010 

; PRIOR FILING DATE: 2000-02-25 

; PRIOR APPLICATION NUMBER: 09/516,745 

; PRIOR FILING DATE: 2000-03-01 

; PRIOR APPLICATION NUMBER: 09/572,002 

; PRIOR FILING DATE: 2000-05-14 

; PRIOR APPLICATION NUMBER: 09/597,993 
PRIOR FILING DATE: 2000-06-19 
PRIOR APPLICATION NUMBER: 09/599,596 

; PRIOR FILING DATE: 2000-06-22 

; PRIOR APPLICATION NUMBER: 09/630,334 

; PRIOR FILING DATE: 2000-07-31 

; PRIOR APPLICATION NUMBER: 09/606,565 

; PRIOR FILING DATE: 2000-06-29 

; PRIOR APPLICATION NUMBER: 09/606,317 

; PRIOR FILING DATE: 2000-06-29 

PRIOR APPLICATION NUMBER: 09/665,666 

; PRIOR FILING DATE: 2000-09-20 

; PRIOR APPLICATION NUMBER: 09/677,751 

; PRIOR FILING DATE: 2000-09-30 

; NUMBER OF SEQ ID NOS: 162

; SEQ ID NO 114 

; LENGTH: 1050

TYPE: PRT 

; ORGANISM: Homo sapiens 
US-09-796-753-114 



Query Match 37.2%; Score 2506.5; DB 10; Length 1050; 

Best Local Similarity 40.5%; Pred. No. 7.7e-158; 

Matches 490; Conservative 111; Mismatches 345; Indels 263; Gaps 3 

Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW 66

Ml : I II | | | | I M I : : I : I I : II : I I

Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66

Qy 67 FKCTRHRVSYR- TAY 80

I I :i

Db 67 SPQTQRKLLASRDSFCMVCVGAGVQWRDRSALQPQTGNALSMRPQPRVLSGAPSLASPGH 126

Qy 81 RHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGWGGTNCSSA-- 138 

|| : | : : Ml MM III I I MINIM Ml Ml I Mill 

Db 127 TVWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGWRGDDCSSAPN 186 
Qy 139 CDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQR 194 

! : M I I M I I M I I M M II I I I M I II I 

Db 187 CLQPCTPGYYGPACQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGT 237 

Qy 195 CQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECS 254 

M M I I lllllll I II 

Db 238 SGFFC PSTH PCQNGGVFQTPQGSCS 262 

Qy 255 CPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVG 314 

||: I I I II I II M I I I I I M II II M I M M II II 

Db 263 CPPGWMGTICSLPCPEGFHGPNCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVG 322

Qy 315 TYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLEN 374 

: | I I I I I I : I : M I I I I I I I M II II II M I I : I II M 

Db 323 RFGQDCAETCDCAPDARCFPANGACLCEHGFTGDRCTDRLCPDGFYGLSCQAPCTCDREH 382 

Qy 375 THSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGF 434 

: | I II I M I M I II I M I M I M I M I M I I M I : M I I I I M 
Db 383 SLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGVCQATSGLCQCAPGY 442 

Qy 435 KGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFG 494 

I | : : I I I I I M I M I I I M I I M II I II I I M M I I M I M 

Db 443 TGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGTWGFS 502 

Qy 495 CNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTT 554 

II Mil • M I II I II I I Mill I M II II II M M I I 

Db 503 CNASCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGCDPVH 562

Qy 555 GHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRI 614 

MM III I I II II I I I I I I II M M M I II II I 11 M II 

Db 563 GRCQCQAGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRGPSCQRS 622

Qy 615 CSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICT 674 

I I I I I I hi I 
Db 623 CQPGRYGKR CVP CK 636 

Qy 675 CTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCT 734

||: I : I : : t I I I I II I II I I I I M I I MUM I I I I I 

Db 637 CANHSFCHPSNGTCYCLAGWTGPDCSQPCPPGHWGENCAQTCQCHHGGTCHPQDGSCICP 696

Qy 735 PGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTY 794

I I I I M : I I I i M M : II I I I M 
Db 697 LGWTGHHCLEGCPLGTFGANCSQPCQCGPGEKC HPE 732 

Qy 795 GYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPAD-- 852

I I I I II M I M : I : I 

Db 733 TGACVCPPGHSGAPCR IG IQEPFTVMPTTPV 763

Qy 853 SY-QIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGT 911

: I M M II M M : I M I II I M II I II MM M : M = 



Db 764 AYNSLGAVIGIAVLGSLVVALVALFIGYRHWQKGKEHHHLAVAYSSG-RLDGSEYVMPDV 822



Qy 912 LPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKN-VNPGKR 970 

I : I I I : : I I I I I I I : I I : : I I : I I : I : I II 

Db 823 PP SYSHYYSNPSYHTLSQCSPNPPPPNK VPGPLFASLQNPERPG-- 866 

Qy 971 GPVG-DCTGTLPADWKH GGYLNELGAFGLDRS YMGKSL KDLGKNSEY 1016 

III MINIM I h I: Mill II I 

Db 867 GAQGHDNHTTLPADWKHRREPPPGPLDR-GSSRLDRS YSYSYSNGPGPFYDKGLISEEEL 925 

Qy 1017 NSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYE 1076

: I MM Mill I MINI : I I I 

Db 926 GASVASL-SSENPYATIRDLPSLPGGPRESSYMEMKGPPSGSAPRQPPQFWDSQRRR 981 

Qy 1077 VEPTVSWQGVFSNNGRL SQDP YDLPKNSHI PCHYDLLPVRDS 1119 

: | I : I 111 I I M I I I II I I I I I II 

Db 982 -QPQPQRDSGTYEQPSPLIHDRDSVGSQPPLPPGLPPGHYDSPKNSHIPGHYDLPPVRHP 1040 

Qy 1120 SSSP-KQED 1127 

I I : : : I 
Db 1041 PSPPLRRQD 1049 



RESULT 12 
US-10-052-648A-4 

; Sequence 4, Application US/10052648A 
; Publication No. US20040005558A1 
; GENERAL INFORMATION: 



APPLICANT: Anderson, David
APPLICANT: Burgess, Catherine
APPLICANT: Casman, Stacie
APPLICANT: Colman, Steven
APPLICANT: Edinger, Shlomit R.
APPLICANT: Ellerman, Karen
APPLICANT: Gerlach, Valerie
APPLICANT: Gunther, Erik
APPLICANT: Kekuda, Ramesh
APPLICANT: MacDougall, John R.
APPLICANT: Mehraban, Fuad
APPLICANT: Patturajan, Meera
APPLICANT: Rothenberg, Mark
APPLICANT: Shimkets, Richard
APPLICANT: Smithson, Glennda
APPLICANT: Spytek, Kimberly A.
APPLICANT: Stone, David J.
APPLICANT: Vernet, Corine A.M.
APPLICANT: Zerhusen, Bryan D.



TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF 

TITLE OF INVENTION: USING THE SAME 

FILE REFERENCE: 21402-250 (CURA-550) 

CURRENT APPLICATION NUMBER: US/10/052,648A

CURRENT FILING DATE: 2002-12-09 

PRIOR APPLICATION NUMBER: 60/262,454 

PRIOR FILING DATE: 2001-01-18 

PRIOR APPLICATION NUMBER: 60/272,920 

PRIOR FILING DATE: 2001-03-02 

PRIOR APPLICATION NUMBER: 60/284,549 



PRIOR FILING DATE: 2001-04-18 
PRIOR APPLICATION NUMBER: 60/303,229 
PRIOR FILING DATE: 2001-07-05 
PRIOR APPLICATION NUMBER: 60/262,892 
PRIOR FILING DATE: 2001-01-19 
PRIOR APPLICATION NUMBER: 60/263,605 
PRIOR FILING DATE: 2001-01-23 
PRIOR APPLICATION NUMBER: 60/269,098 
PRIOR FILING DATE: 2001-02-15 
PRIOR APPLICATION NUMBER: 60/264,159 
PRIOR FILING DATE: 2001-01-25 
PRIOR APPLICATION NUMBER: 60/265,517 
PRIOR FILING DATE: 2001-01-31 
PRIOR APPLICATION NUMBER: 60/271,855 
PRIOR FILING DATE: 2001-02-27 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 97
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 4
LENGTH: 928
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-052-648A-4 

Query Match 36.7%; Score 2472.5; DB 15; Length 928; 

Best Local Similarity 43.8%; Pred. No. 1.2e-155;

Matches 441; Conservative 100; Mismatches 313; Indels 153; Gaps 18; 

Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW---FKCT 70

Ml : I II I I I I I II I : : I : I I : II : I I I 

Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130 

: | | | | I I II : I : : III I I I I I III II : I I I I I I : I I I I I I I I I 

Db 67 QPTWYRTVYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGW 126

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190 

I : M I I I I I I | | | : I : I : I I I : I : I I III 

Db 127 RGDDCSSECAPGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPA 186

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250 

I I I I I : I I I I II I I I II I : I I I I I I I I I I 

Db 187 CQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGTSGFFCPSTHSCQNGGVFQTPQ 245 



Qy 251 GECSCPSGWM--------------------------GTVCGQPCPEGRFGKNCSQECQCH 284

| | | M | | | I I : I I I I I I I I I I II I : I I 

Db 246 GSCSCPPGWMVWRVGPVGMGCGSGENSVGGAKQGSKGTICSLPCPEGFHGPNCSQECRCH 305 



Qy 285 NGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAG 344

I I I II I I I I I : I I I I I : I I : : M I I I : I I I I I I I : I : : I I I I I I I 

Db 306 NGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDCAETCDCAPDARCFPANGACLCEHG 365

Qy 345 FAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGF 404 

| | : | | | | | | : | I I : I I I : : I I I I |: I I I : I I I I : I I : I I h I 

Db 366 FTGDRCTDRLCPDGFYGLSCQAPRTCDREHSLSCHPMNGECSCLPGWAGLHCNESCPQDT 425 

Qy 405 YGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVC 464



: I | | : | | : | I : : I I i I I I : I I : : II M I : I I I : I I I : I I 

Db 426 HGPGCQEHCLCLHGGVCQATSGLCQCAPGYTGPHCASLCPPDTYGVNCSARCSCENAIAC 485 

Qy 465 SPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEK 524

||: M : I I : I I I I I I I I I : I I I : I = 

Db 486 SPIDGECVCKEGWQRGNCSVPCPPGTWGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAH 545 

Qy 525 CELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLP 584 

1:111 I : : I : I I I > I I I 

Db 546 CQLPCPKGQFGEGCASRCDCDHSDGCDPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNT 605

Qy 585 CYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGL 644 

I I I I I : I I :: I I I I I I I I I : I I I I I I I I I 

Db 606 CTCKNGGTCLPENGNCVCAPGFRGPSCQRSCQPGRYGKR 644 

Qy 645 CDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCP 704

|:| I I I : I : I :: I I II I I I I I I I I 

Db 645 — CVP CKCANHSFCHPSNGACYCLAGWTGPDCSQPCP 679 

Qy 705 PAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNG 764

| | I I M | I I I : I I I I I I II I I : I : I I I I : I : I : III I 

Db 680 PGHWGENCAQTCQCHHGGTCHPQDGSCICPLGWTGHHCLEGCPLGTFGAMCSQPCQCGPG 739

Qy 765 ADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWK 824 

I I I M I I II 

Db 740 EKC HPE TGACVCPPGHS 756 

Qy 825 GARCDQAGVIIVGNLNSLSRTSTALPAD--SY-QIGAIAGIIILVLVVLFLLALFIIYRH 881

Ml : | : | : I : I : I I : I I : I : I : I : I I I I I I I

Db 757 GAPCR IG IQEPFTVMPTTPVAYNSLGAVIGIAVLGSLVVALVALFIGYRH 806

Qy 882 KQKGKESSMPAVTYTPAMRVVNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPH 941

|| | i Ml: I : : : I : I : I I I : : M : M : : I 

Db 807 WQKDKEHHHLAVAYSSG-RLDGSEYVMPDVPP SYSHYYSNPSYHTLSQCSPNPP 859 

Qy 942 VNNRDRMTVTKSKNNQLFVNLKN-VNPGKRGPVG-DCTGTLPADWKH 986 

I : I I : I : I I I I I I I I 

Db 860 PPNK VPGPLFASLQNPERPG--GAQGHDNHTTLPADWKH 896



RESULT 13 
US-10-052-648A-6 

; Sequence 6, Application US/10052648A 
; Publication No. US20040005558A1
; GENERAL INFORMATION: 



APPLICANT: Anderson, David
APPLICANT: Burgess, Catherine
APPLICANT: Casman, Stacie
APPLICANT: Colman, Steven
APPLICANT: Edinger, Shlomit R.
APPLICANT: Ellerman, Karen
APPLICANT: Gerlach, Valerie
APPLICANT: Gunther, Erik
APPLICANT: Kekuda, Ramesh
APPLICANT: MacDougall, John R.
APPLICANT: Mehraban, Fuad
APPLICANT: Patturajan, Meera



APPLICANT: Rothenberg, Mark 
APPLICANT: Shimkets, Richard 
APPLICANT: Smithson, Glennda 
APPLICANT: Spytek, Kimberly A. 
APPLICANT: Stone, David J. 
APPLICANT: Vernet, Corine A.M. 
APPLICANT: Zerhusen, Bryan D. 

TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF 
TITLE OF INVENTION: USING THE SAME 
FILE REFERENCE: 21402-250 (CURA-550) 
CURRENT APPLICATION NUMBER: US/10/052,648A
CURRENT FILING DATE: 2002-12-09 
PRIOR APPLICATION NUMBER: 60/262,454 
PRIOR FILING DATE: 2001-01-18 
PRIOR APPLICATION NUMBER: 60/272,920 
PRIOR FILING DATE: 2001-03-02 
PRIOR APPLICATION NUMBER: 60/284,549 
PRIOR FILING DATE: 2001-04-18 
PRIOR APPLICATION NUMBER: 60/303,229 
PRIOR FILING DATE: 2001-07-05 
PRIOR APPLICATION NUMBER: 60/262,892 
PRIOR FILING DATE: 2001-01-19 
PRIOR APPLICATION NUMBER: 60/263,605 
PRIOR FILING DATE: 2001-01-23 
PRIOR APPLICATION NUMBER: 60/269,098 
PRIOR FILING DATE: 2001-02-15 
PRIOR APPLICATION NUMBER: 60/264,159 
PRIOR FILING DATE: 2001-01-25 
PRIOR APPLICATION NUMBER: 60/265,517 
PRIOR FILING DATE: 2001-01-31 
PRIOR APPLICATION NUMBER: 60/271,855 
PRIOR FILING DATE: 2001-02-27 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 97
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 6
LENGTH: 928
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-052-648A-6 

Query Match 36.7%; Score 2472.5; DB 15; Length 928; 

Best Local Similarity 43.8%; Pred. No. 1.2e-155; 

Matches 441; Conservative 100; Mismatches 313; Indels 153; Gaps 18;

Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW---FKCT 70

III : I II I I I I I I I I : : I : I I : I I : I I I 

Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130 

: | I I I I I I I : I : : I I I I I I I I I I I I I : I I I II I : I I I I I I I I I 

Db 67 QPTWYRTVYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGW 126

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190 

| : | | | I I I I I I I I : I : I : I I I : I : I I I I I 

Db 127 RGDDCSSECAPGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPA 186 



Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250

I I I I I : IN I I I : I I I 

Db 187 CQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGTSGFFCPSTHSCQNGGVFQTPQ 245 



Qy 251 GECSCPSGWM--------------------------GTVCGQPCPEGRFGKNCSQECQCH 284

| | | | | | | | I I : I Mill I I I I I I I : I I 

Db 246 GSCSCPPGWMVWRVGPVGMGCGSGENSVGGAKQGSKGTICSLPCPEGFHGPNCSQECRCH 305 



Qy 285 NGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAG 344 

| I I M I I I I I : I I I I I : I I : : I I i I I : I I I I I I I : I : : I I I I I I I 

Db 306 NGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDCAETCDCAPDARCFPANGACLCEHG 365

Qy 345 FAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGF 404 

| | : | | | | 1 I : I I I : I I I : : I I I I I : I I I : I I I I : I I : I I I = I 

Db 366 FTGDRCTDRLCPDGFYGLSCQAPRTCDREHSLSCHPMNGECSCLPGWAGLHCNESCPQDT 425 

Qy 405 YGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVC 464

: | I I : I I : I I : : I I I I I I : I I : : I I I I I : I I I : I I I : I I 

Db 426 HGPGCQERCLCLHGGVCQATSGLCQCAPGYTGPHCASLCPPDTYGVNCSARCSCENAIAC 485

Qy 465 SPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEK 524 

I I : I I I I I II : I I : I I I I I I I I I : I I I : I : I I I I I I I I 

Db 486 SPIDGECVCKEGWQRGNCSVPCPPGTWGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAH 545

Qy 525 CELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLP 584

I : M I I : I 1111111:1111 I hi I I I I I I I I I I I I 
Db 546 CQLPCPKGQFGEGCASRCDCDHSDGCDPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNT 605

Qy 585 CYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGL 644 

I I I I I : I I : : I I I I I I I I I : I I I I II I I I 

Db 606 CTCKNGGTCLPENGNCVCAPGFRGPSCQRSCQPGRYGKR 644 

Qy 645 CDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCP 704 

| : | Ml: | : | : : I I I I I I I I I I I I 

Db 645 — CVP CKCANHSFCHPSNGACYCLAGWTGPDCSQPCP 679 

Qy 705 PAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNG 764 

I I I I I I II I I : I I I I I I I I I I : I : I I I I : I : I : Ml I 

Db 680 PGHWGENCAQTCQCHHGGTCHPQDGSCICPLGWTGHHCLEGCPLGTFGANCSQPCQCGPG 739

Qy 765 ADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWK 824 

I I I I 

Db 740 EKC HPE TGACVCPPGHS 756 

Qy 825 GARCDQAGVIIVGNLNSLSRTSTALPAD--SY-QIGAIAGIIILVLVVLFLLALFIIYRH 881

Ml : | : I : I : | : I I : I I : I : I : I : I I I I I M

Db 757 GAPCR IG IQEPFTVMPTTPVAYNSLGAVIGIAVLGSLVVALVALFIGYRH 806

Qy 882 KQKGKESSMPAVTYTPAMRVVNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPH 941

I I I I |||: I : : : I = I : I I I : : I : I I : = I 

Db 807 WQKDKEHHHLAVAYSSG-RLDGSEYVMPDVPP SYSHYYSNPSYHTLSQCSPNPP 859

Qy 942 VNNRDRMTVTKSKNNQLFVNLKN-VNPGKRGPVG-DCTGTLPADWKH 986 

| : | 1 : I : I II I II I I I I I II I 

Db 860 PPNK VPGPLFASLQNPERPG--GAQGHDNHTTLPADWKH 896 



RESULT 14 
US-10-052-648A-2 

; Sequence 2, Application US/10052648A 
; Publication No. US20040005558A1 
; GENERAL INFORMATION: 


; APPLICANT: MacDougall, John R.
; APPLICANT: Mehraban, Fuad
; APPLICANT: Patturajan, Meera
; APPLICANT: Rothenberg, Mark
; APPLICANT: Shimkets, Richard
; APPLICANT: Smithson, Glennda
; APPLICANT: Spytek, Kimberly A.
; APPLICANT: Stone, David J.
; APPLICANT: Vernet, Corine A.M.
; APPLICANT: Zerhusen, Bryan D.



; TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF

; TITLE OF INVENTION: USING THE SAME 

; FILE REFERENCE: 21402-250 (CURA-550) 

; CURRENT APPLICATION NUMBER: US/10/052,648A

; CURRENT FILING DATE: 2002-12-09 

; PRIOR APPLICATION NUMBER: 60/262,454 

; PRIOR FILING DATE: 2001-01-18 

; PRIOR APPLICATION NUMBER: 60/272,920 

; PRIOR FILING DATE: 2001-03-02 

; PRIOR APPLICATION NUMBER: 60/284,549 

; PRIOR FILING DATE: 2001-04-18 

; PRIOR APPLICATION NUMBER: 60/303,229 

; PRIOR FILING DATE: 2001-07-05 

; PRIOR APPLICATION NUMBER: 60/262,892 

; PRIOR FILING DATE: 2001-01-19 

; PRIOR APPLICATION NUMBER: 60/263,605 

; PRIOR FILING DATE: 2001-01-23 

; PRIOR APPLICATION NUMBER: 60/269,098 

; PRIOR FILING DATE: 2001-02-15 

; PRIOR APPLICATION NUMBER: 60/264,159 

; PRIOR FILING DATE: 2001-01-25 

; PRIOR APPLICATION NUMBER: 60/265,517 

; PRIOR FILING DATE: 2001-01-31 

; PRIOR APPLICATION NUMBER: 60/271,855 

; PRIOR FILING DATE: 2001-02-27 

; Remaining Prior Application data removed - See File Wrapper or PALM. 

; NUMBER OF SEQ ID NOS: 97

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 2 

; LENGTH: 1020 

; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-10-052-648A-2 



Query Match 36.4%; Score 2451.5; DB 15; Length 1020; 

Best Local Similarity 40.6%; Pred. No. 3.3e-154;

Matches 481; Conservative 110; Mismatches 348; Indels 247; Gaps 28;



Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW---FKCT 70

Ml : 1 II 1 1 1 i 1 1 1 1 : : 1 : 1 1 : 1 1 : 1 1 1

Db 9 LLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCE--RPWEGPHTCP 66

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130

: | 1 1 1 1 1 II : 1 :: 1 1 1 1 1 1 1 1 1 1 1 1 1 : II 1 1 1 1 : 1 1 1 1 1 1 1 1 1

Db 67 QPTWYRTVYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGW 126

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190

| : M 1 1 1 1 1 1 1 1 1 : 1 : 1 : 1 1 1 : 1 : 1 1 1 1 1

Db 127 RGDDCSSECAPGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPA 186

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250

I 1111:1111 Mill II 1 : 1 1 1 1 MINI

Db 187 CQFRCQC-HGAPCDPQTGACFCPAERTGPSCDVSCSQGTSGFFCPSTHSCQNGGVFQTPQ 245

Qy 251 GECSCPSGWM--------------------------GTVCGQPCPEGRFGKNCSQECQCH 284

| | | | | | | | 1 1 = 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1

Db 246 GSCSCPPGWMVWRVGPVGMGCGSGENSVGGAKQGSKGTICSLPCPEGFHGPNCSQECRCH 305

Qy 285 NGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAG 344

I I I || I 1 1 1 1 : 1 1 1 1 1 : 1 1 : : 1 1 1 1 1 -1 INN 1 = 1 : M II II 1 1

Db 306 NGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDCAETCDCAPDARCFPANGACLCEHG 365

Qy 345 FAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGF 404

I Mil 1 1 1 M 1 II: 1 1 i 1 : s II 1 II : 1 1 M 1 1 1 M 1 M 1 1 M 1

Db 366 FTGDRCTDRLCPDGFYGLSCQAPCTCDREHSLSCHPMNGECSCLPGWAGLHCNESCPQDT 425

Qy 405 YGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVC 464

: | M : 1 1 M 1 : : 1 1 1 1 1 M 1 M : II 1 1 M 1 1 M M Ml 1

Db 426 HGPGCQEHCLCLHGGVCQATSGLCQCAPGYTGPHCASLCPPDTYGVNCSARCSCENAIAC 485

Qy 465 SPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEK 524

I M | I 1 1 1 1 1 MM II Mill II MM : M 1 II 1 1 II 1

Db 486 SPIDGECVCKEGWQRGNCSVPCPPGTWGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAH 545

Qy 525 CELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLP 584

Mill 1 M II II 1 1 M M 1 1 1 Ml III 1 1 M II 1 II

Db 546 CQLPCPKGQFGEGCASRCDCDHSDGCDPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNT 605

Qy 585 CYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGL 644

I 1 1 1 1 : 1 MM 1 M II 1 II MM 1 II II 1

Db 606 644

Qy 645 CDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCP 704

1 : 1

Db 645 647

Qy 705 PAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNG 764

|||:ll MM 1 1 1 1 1 M 1 II II : 1 : M III 1

Db 648 CKCANHSFCHPSNGTCYCLAGWTGPDCSQRCPLGTFGANCSQPCQCGPG 696



Qy 765 ADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWK 824

I II I I I I I I 

Db 697 EKC HPE TGACVCPPGHS 713

Qy 825 GARCDQAGVIIVGNLNSLSRTSTALPAD--SY-QIGAIAGIIILVLVVLFLLALFIIYRH 881

Ml : | : i : I : I : I I : I I : I : I : I : I I I I I I I

Db 714 GAPCR IG IQEPFTVMPTTPVAYNSLGAVIGIAVLGSLVVALVALFIGYRH 763

Qy 882 KQKGKESSMPAVTYTPAMRVVNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPH 941

| | | | | Ml: I : : : I : I : I I I : : I I I I I I I : I I : : I 

Db 764 WQKGKEHHHLAVAYSSG-RLDGSEYVMPDVPP SYSHYYSNPSYHTLSQCSPNPP 816 

Qy 942 VNNRDRMTVTKSKNNQLFVNL-KNVNPGKRGPVG-DCTGTLPADWKH GGYLNELG 994

| : I I : I I ill! I I I I I I I I I II: I 

Db 817 PPNK VPGPLFASLQKPERPG--GAQGHDNHTTLPADWKHRREPPPGPLDR-G 865

Qy 995 AFGLDRSYMGKSL KDLGKNSEYNSSNCSLSSSENPYATIKDPPVLIPKSSEC 1046

: M I I I II I : I I I I I I I I I I M : I I I I 

Db 866 SSRLDRSYSYSYSNGPGPFYNKGLISEEELGASVASL-SSENPYATIRDLPSLPGGPRES 924 

Qy 1047 GYVEMKSP ARRDS PYAEINNSTSANRNVYEVEPTVSWQGVFSNN 1091 

1:1111 | | | : :: | I I : I : 
Db 925 SYMEMKGPPSGSPPRQPPQFWDSQRRRQPQPQRDSGT YE-QPSPL IHDRD 973 

Qy 1092 GRLSQDP YDLPKNSHIPCHYDLLPVRDSSSSP-KQED 1127 

M I I I I I I I I I I I I I I I I I II::: I 

Db 974 SVGSQPPLPPGLPPGHYDSPKNSHIPGHYDLPPVRHPPSPPLRRQD 1019 



RESULT 15 
US-09-796-753-100 

; Sequence 100, Application US/09796753 

; Publication No. US20030027998A1

; GENERAL INFORMATION: 

; APPLICANT: McCarthy, Sean A. 

; TITLE OF INVENTION: SECRETED PROTEINS AND USES THEREOF 

; FILE REFERENCE: 7853-227-999 

; CURRENT APPLICATION NUMBER: US/09/796,753

; CURRENT FILING DATE: 2001-03-01



; PRIOR APPLICATION NUMBER: 09/183,175
; PRIOR FILING DATE: 1998-10-30
; PRIOR APPLICATION NUMBER: 09/223,094
; PRIOR FILING DATE: 1998-12-30
; PRIOR APPLICATION NUMBER: 09/223,546
; PRIOR FILING DATE: 1998-12-30
; PRIOR APPLICATION NUMBER: 09/224,246
; PRIOR FILING DATE: 1998-12-30
; PRIOR APPLICATION NUMBER: 09/259,388
; PRIOR FILING DATE: 1999-02-26
; PRIOR APPLICATION NUMBER: 60/122,458
; PRIOR FILING DATE: 1999-03-01
; PRIOR APPLICATION NUMBER: 09/312,359
; PRIOR FILING DATE: 1999-05-14
; PRIOR APPLICATION NUMBER: 09/336,536
; PRIOR FILING DATE: 1999-06-18
; PRIOR APPLICATION NUMBER: 09/342,687
; PRIOR FILING DATE: 1999-06-29





; PRIOR APPLICATION NUMBER: 09/345,464 

; PRIOR FILING DATE: 1999-06-30 

; PRIOR APPLICATION NUMBER: 09/365,164 

; PRIOR FILING DATE: 1999-07-30 

; PRIOR APPLICATION NUMBER: 09/399,723 

; PRIOR FILING DATE: 1999-09-20 

PRIOR APPLICATION NUMBER: 09/409,634 
; PRIOR FILING DATE: 1999-09-30 
; PRIOR APPLICATION NUMBER: 09/471,179 
; PRIOR FILING DATE: 1999-12-23 
; PRIOR APPLICATION NUMBER: 09/474,071 
; PRIOR FILING DATE: 1999-12-29 
; PRIOR APPLICATION NUMBER: 09/474,072 
; PRIOR FILING DATE: 1999-12-29 
; PRIOR APPLICATION NUMBER: 09/514,010 
; PRIOR FILING DATE: 2000-02-25 
; PRIOR APPLICATION NUMBER: 09/516,745 
; PRIOR FILING DATE: 2000-03-01 
; PRIOR APPLICATION NUMBER: 09/572,002 
; PRIOR FILING DATE: 2000-05-14 
; PRIOR APPLICATION NUMBER: 09/597,993 
; PRIOR FILING DATE: 2000-06-19 
; PRIOR APPLICATION NUMBER: 09/599,596 
; PRIOR FILING DATE: 2000-06-22 

PRIOR APPLICATION NUMBER: 09/630,334 
; PRIOR FILING DATE: 2000-07-31 
; PRIOR APPLICATION NUMBER: 09/606,565 

PRIOR FILING DATE: 2000-06-29 

PRIOR APPLICATION NUMBER: 09/606,317 
; PRIOR FILING DATE: 2000-06-29 
; PRIOR APPLICATION NUMBER: 09/665,666 
; PRIOR FILING DATE: 2000-09-20
; PRIOR APPLICATION NUMBER: 09/677,751 
; PRIOR FILING DATE: 2000-09-30 
; NUMBER OF SEQ ID NOS: 162
; SEQ ID NO 100 
; LENGTH: 636
; TYPE: PRT 

; ORGANISM: Rattus sp.
US-09-796-753-100 

Query Match 28.3%; Score 1909; DB 10; Length 636; 

Best Local Similarity 45.1%; Pred. No. 2e-118; 

Matches 328; Conservative 77; Mismatches 212; Indels 110; Gaps 9; 
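
Each result in this listing carries the same three semicolon-delimited statistics lines. As a minimal sketch, assuming the lines have already been cleaned of extraction damage, the fields can be pulled into a dictionary with one regular expression. The names `STAT_RE` and `parse_stats` are hypothetical and are not part of GenCore.

```python
import re

# A label (letters, dots, spaces), then a number with an optional exponent,
# an optional percent sign, and the terminating semicolon.
STAT_RE = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+([-+]?\d+(?:\.\d+)?(?:e[-+]?\d+)?)\s*%?\s*;"
)


def parse_stats(text: str) -> dict:
    """Return {label: value} for every 'Label value;' field found in text."""
    return {name.strip(): float(value) for name, value in STAT_RE.findall(text)}


block = (
    "Query Match 28.3%; Score 1909; DB 10; Length 636; "
    "Best Local Similarity 45.1%; Pred. No. 2e-118; "
    "Matches 328; Conservative 77; Mismatches 212; Indels 110; Gaps 9;"
)
stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."], stats["Gaps"])
```

Fields whose terminating semicolon was lost in extraction would be skipped by this pattern, so it is only suitable for lines that have been repaired first.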



Qy 260 MGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVL 319

M : | I I I I I I I I : I I I • I M I I ii iiiiii'iii i ■ i i • • i i i ii -i

Db 1 MGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCREECPVGRFGQD 60

Qy 320 CAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCH 379

Db 61

Qy 380 PMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDC 439

Db 121




Qy 440 STPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTC 499 

: I I I i I I : I IIIIII:IMM : I h : I 

Db 181 ANLCPPNTYGINCSSHCSCENAIACSPVDGTCICKEGWQRGNCSVPCPPGTWGFSCNASC 240 

Qy 500 QCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRC 559 

I I : I I : I 1 I hill I : I I I Nihil I I 

Db 241 QCAHEGVCSPQTGACTCTPGWRGVHCQLPCPKGQFGEGCASVCDCDHSDGCDPVHGHCRC 300 

Qy 560 LPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGF 619 

III I I I I I I I I I | | I I I : I I I I I I I I I I I : I I I I I I 
Db 301 QAGWMGTRCHLPCPEGFWGANCSNACTCKNGGTCVPENGNCVCAPGFRGPSCQRPCPPGR 360 

Qy 620 YGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNG 679

III hi I I I : 

Db 361 YGKR CVP CKCNNHS 374

Qy 680 TCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTG 739 

: I : I I : I I I I I I I I : I I I I M I I I h I I I I I I I I I I I I 
Db 375 SCHPSDGTCSCLAGWTGPDCSESCPPGHWGLKCSQPCQCHHGATCHPQDGSCVCIPGWTG 434

Qy 740 LYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCR 799 

|: : I I : | : |: : I I I I I 11 
Db 435 PNCSEGCPSRMFGVNCSQLCQCDPGEMC HPE 465 

Qy 800 QICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSYQIGAI 859 

I I I I I I Ml I h I : hi : I h 

Db 466 TGACVCPPGHSGAHCK VGSQESFTIMPTS-PVIHNSLGAV 504 

Qy 860 AGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGTLPHSNGGN 919

I I : I : h h I I I I I I I I I I M III: h Hi: | 
Db 505 IGIAVLGTLWALVALFIGYRHWQKGKEHEHLAVAYSTG-RLDGSDYVMPDVSP 557 

Qy 920 ANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGKRGPVGDCTGT 979

: I I I: : I I I I I I I : M : : I I I : I I I h : I I I 

Db 558 SYSHYYSNPSYHTLSQCSPNPPPPN KIPGSQLFVSSQASERPNRNHGRDNHAT 610 

Qy 980 LPADWKH 986

Db 611 LPADWKH 617 



Search completed: March 26, 2004, 16:21:15 
Job time : 59.8389 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: March 26, 2004, 16:04:46 ; Search time 55.4809 Seconds 

(without alignments)
6483.148 Million cell updates/sec 
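
The throughput figure above appears to be the size of the dynamic-programming problem divided by the search time. The sketch below assumes one cell update per (query residue, database residue) pair, which is an inference from the numbers in this header rather than a documented GenCore definition. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Assumed definition: query length x residues searched / search time, in millions."""
    return query_len * db_residues / seconds / 1e6


# Values from this run: 1140-residue query, 315518202 residues, 55.4809 seconds.
rate = million_cell_updates_per_sec(1140, 315518202, 55.4809)
print(f"{rate:.3f}")
```

The result agrees with the reported 6483.148 to within a few thousandths, a difference that could be explained by rounding of the printed search time.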



Title: US-10-092-390-2 
Perfect score: 6744 

Sequence: 1 MVISLNSCLSFICLLLCHWI SSPKQEDSGGSSSNSSSSSE 1140

Scoring table: BLOSUM62 

Gapop 10.0, Gapext 0.5

Searched: 1017041 seqs, 315518202 residues 

Total number of hits satisfying chosen parameters: 1017041 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
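
The parameters above describe a Smith-Waterman ("sw model") local alignment with a gap-open and a gap-extension penalty. The following is a minimal score-only sketch of that technique using Gotoh's affine-gap recurrences. It is not GenCore's implementation. Two things are assumptions made to keep the example short: a flat match/mismatch score stands in for the BLOSUM62 table, and a gap of length k is charged gap_open + k * gap_ext.

```python
def sw_score(a: str, b: str, match: float = 5.0, mismatch: float = -4.0,
             gap_open: float = 10.0, gap_ext: float = 0.5) -> float:
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    h_prev = [0.0] * (len(b) + 1)      # H for the previous row
    f = [neg_inf] * (len(b) + 1)       # best score ending in a gap that skips residues of a
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (len(b) + 1)
        e = neg_inf                    # best score ending in a gap that skips residues of b
        for j in range(1, len(b) + 1):
            # Extend an existing gap or open a new one, whichever scores higher.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f[j] = max(f[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + sub, e, f[j])
            best = max(best, h_cur[j])
        h_prev = h_cur
    return best


print(sw_score("ACDEFGHIKL", "ACDEFHIKL"))
```

In the usage line, nine residues match and one query residue is skipped, so under the assumed penalties the score is 9 x 5 minus 10.5. A faithful reproduction of the scores in this report would need the real BLOSUM62 matrix and GenCore's exact gap convention, neither of which is given here.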



Database : SPTREMBL_25:*

1: sp_archea:*

2: sp_bacteria:*

3: sp_fungi:*

4: sp_human:*

5: sp_invertebrate:*

6: sp_mammal:*

7: sp_mhc:*

8: sp_organelle:*

9: sp_phage:*

10: sp_plant:*

11: sp_rodent:*

12: sp_virus:*

13: sp_vertebrate:*

14: sp_unclassified:*

15: sp_rvirus:*

16: sp_bacteriap:*

17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

   No.    Score    Match  Length  DB  ID        Description
     1     6744    100.0    1140   4  Q96KG7    Q96kg7 homo sapien
                                      O54796
                            1918   5  Q86AS3    Q86as3 dictyosteli
    39    870.5     12.9     569   4  Q8NHD4    Q8nhd4 homo sapien
    40      863     12.8     713   5  Q962W9    Q962w9 podocoryne
    41      839     12.4     320   4  Q8N780    Q8n780 homo sapien
    42    832.5     12.3     752  13  O42374    O42374 brachydanio
    43      800     11.9     866   4  Q8IXF3    Q8ixf3 homo sapien
    44    790.5     11.7    1214  13  Q90YD2    Q90yd2 xenopus lae
    45      782     11.6    1254  13  Q9YHU2    Q9yhu2 brachydanio


ALIGNMENTS 



RESULT 1 
Q96KG7 

ID Q96KG7 PRELIMINARY; PRT; 1140 AA.

AC Q96KG7;

DT 01-DEC-2001 (TrEMBLrel. 19, Created)

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)



DE MEGF10 protein (Hypothetical protein KIAA1780).

GN MEGF10.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Hippocampus;

RX MEDLINE=21245130; PubMed=11347906;

RA Nagase T., Nakayama M., Nakajima D., Kikuno R., Ohara O.;

RT "Prediction of the coding sequences of unidentified human genes. XX.

RT The complete sequences of 100 new cDNA clones from brain which code

RT for large proteins in vitro.";

RL DNA Res. 8:85-95(2001).

DR EMBL; AB058676; BAB47409.1;

DR GO; GO:0005198; F:structural molecule activity; IEA.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR009030; Grow_fac_recep.

DR InterPro; IPR002049; Laminin_EGF.

DR Pfam; PF00008; EGF; 10.

DR PRINTS; PR00011; EGFLAMININ.

DR SMART; SM00180; EGF_Lam; 6.

DR PROSITE; PS00022; EGF_1; 17.

DR PROSITE; PS01186; EGF_2; 17.

KW Hypothetical protein; EGF-like domain; Laminin EGF-like domain.

SQ SEQUENCE 1140 AA; 122204 MW; 45B2FA239423895A CRC64;

Query Match 100.0%; Score 6744; DB 4; Length 1140; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1140; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
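
The report does not state how "Query Match" and "Best Local Similarity" are computed. The sketch below shows ratios that reproduce the printed values for the results in this listing: score divided by the perfect score, and matches divided by the total number of alignment columns. Both formulas are inferred from the numbers, and the function names are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the query's self-alignment (perfect) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns.

    Assumes the column count is the sum of the four reported counts.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# RESULT 1 in this report: score 6744 against a perfect score of 6744,
# with 1140 matches and no other columns.
print(round(query_match_percent(6744, 6744), 1))
print(round(best_local_similarity_percent(1140, 0, 0, 0), 1))
```

For RESULT 2 (score 3769; 600 matches, 126 conservative, 208 mismatches, 90 indels) the same functions give 55.9 and 58.6, which agrees with the values printed for that result.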

Qy    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 RCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGIC 600

Qy  601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 ECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVC 660

Qy  661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 PSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHN 720

Qy  721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 GAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGF 780

Qy  781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 MGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLN 840

Qy  841 SLSRTSTALPADSYQIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 SLSRTSTALPADSYQIGAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMR 900

Qy  901 VVNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 VVNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFV 960

Qy  961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 NLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSN 1020

Qy 1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 CSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPT 1080

Qy 1081 VSVVQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 VSVVQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE 1140



RESULT 2 
Q96KG6 

ID Q96KG6 PRELIMINARY; PRT; 969 AA. 

AC Q96KG6;

DT 01-DEC-2001 (TrEMBLrel. 19, Created)

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE MEGF11 protein (Hypothetical protein KIAA1781).

GN MEGF11.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Brain;

RX MEDLINE=21245130; PubMed=11347906;

RA Nagase T., Nakayama M., Nakajima D., Kikuno R., Ohara O.;

RT "Prediction of the coding sequences of unidentified human genes. XX. 

RT The complete sequences of 100 new cDNA clones from brain which code 

RT for large Proteins in vitro."; 

RL DNA Res. 8:85-95(2001). 

DR EMBL; AB058677; BAB47410.1; 

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0045285; C : ubiquinol-cytochrome-c reductase complex; IEA. 

DR GO; GO: 0005198; F: structural molecule activity; IEA. 

DR GO; GO: 0008121; F: ubiquinol-cytochrome-c reductase activity; IEA. 

DR GO; GO: 0006118; P: electron transport; IEA. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR002049; Laminin_EGF. 

DR InterPro; IPR005805; Rieske. 

DR Pfam; PF00008; EGF; 12. 

DR PRINTS; PR00011; EGFLAMININ. 

DR SMART; SM00180; EGF_Lam; 8. 

DR PROSITE; PS00022; EGF_1; 17.

DR PROSITE; PS01186; EGF_2; 17.

DR PROSITE; PS00200; RIESKE_2; 1. 

KW Hypothetical protein; EGF-like domain; Laminin EGF-like domain. 

SQ SEQUENCE 969 AA; 101600 MW; 56DD2FFE139C8209 CRC64; 

Query Match 55.9%; Score 3769; DB 4; Length 969;

Best Local Similarity 58.6%; Pred. No. 2.9e-292; 

Matches 600; Conservative 126; Mismatches 208; Indels 90; Gaps 

Qy  109 CADKCVHGRCIAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACH 168
        I : : | M | | I : : I : i I I I I I I I I : I II II I I I I I M : : I I II : I I I I I I I I I M I
Db   28 CTEECVHGRCVSPDTCHCEPGWGGPDCSSGCDSDHWGPHCSNRCQCQNGALCNPITGACA 87

Qy  169 CAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPG 228
        1111111111:1 I I : I I I I I : : I I = I I I I I I : I I : I I I I
Db   88 CAAGFRGWRCEELCAPGTHGKGCQLPCQCRHGASCDPRAGECLCAPGYTGVYCEELCPPC 147

Qy  229 KHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGT 288
        II II I I I I 1 I I I llhlllhll II I II MM I M : I I I I : I I I : I I
Db  148 SHGAHCELRCPCQNGGTCHHITGECACPPGWTGAVCAQPCPPGTFGQNCSQDCPCHHGGC 207

Qy  289 CDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGE 348
        II I I I I I I : I I I : I I I : M I I : : I I : : I I I I I : I : I I I I I I : I
Db  208

Qy  349 RCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEA 408
        I I : I I I I I I I : I I III : I I I I I I :: I I I : I I I I I : II I : I I : I I :



Db 268 RCQERLCPEGLHGPGCTLPCPCDADNTISCHPVTGACTCQPGWSGHHCNESCPVGYYGDG 327 

Qy 409 CQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVD 468 

II I : I I I : I I I I I I I II I i : I I ' 

Db 328 CQLPCTCQNGADCHSITGGCTCAPGFMGEVCAVSCAAGTYGPNCSSICSCNNGGTCSPVD 387 

Qy 469 GSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELP 528 

I I I I I I II I : I I : : I I I I I I I I I = I : • I I : I : : I I I I 

Db 388 GSCTCKEGWQGLDCTLPCPSGTWGLNCNESCTCANGAACSPIDGSCSCTPGWLGDTCELP 447 

Q y 529 CQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCK 58 8 

I | I I : I I I I : I I I I I I I I II I I I I I II I I : I : I I I I I I I = I I : 

Db 448 CPDGTFGLNCSEHCDCSHADGCDPVTGHCCCLAGWTGIRCDSTCPPGRWGPNCSVSCSCE 507 

Qy 589 NGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCL 64 8 

|| | I I I : I I I II I I I I I I M I I I I I I I I I I : I M I I I I I I M II : I : I : I I 

Db 508 NGGSCSPEDGSCECAPGFRGPLCQRICPPGFYGHGCAQPCPLCVHSSRPCHHISGICECL 567 

Qy 64 9 PGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHW 7 08 

| | | : | | M | : | | | I I :: I I : I = I M I I I : I II I I II : I I I II I M I I I I I 
Db 568 PGFSGALCNQVCAGGYFGQDCAQLCSCANNGTCSPIDGSCQCFPGWIGKDCSQACPPGFW 627 

Q y 709 GPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCD 7 68 

| | | | | : I I I I I I I I I I I I I I I I I II : II I I I I 1:1111 : I I I I I I I I I 
Db 628 GPACFHACSCHNGASCSAEDGACHCTPGWTGLFCTQRCPAAFFGKDCGRVCQCQNGASCD 687 

Q y 769 HISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARC 828 

| | | | : | | | | | II I : I I I I : I I I : I I I I : I : I : I : I I I I I I I : I I I I : I I I I 

Db 688 HI SGKCTCRTGFTGQHCEQRCAPGTFGYGCQQLCECMNNSTCDHVTGTCYCSPGFKGIRC 747 

Qy 829 DQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGIIILVLWLFLLALFII YRHKQKGKES 888 

Ml : : : II r = I II I = - =11= I I = = I = : : I I I I : I : I I I 

Db 7 48 DQA-ALMMEELNPYTKISPALGAERHSVGAVTGIMLLLFFIWLLGLFAWHRRRQKEKGR 8 06 

Qy 889 SM-PAVTYTPAMRWNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDR 947 

: I I : I I I I I I : : I I : : I 

Db 8 07 DLAPRVSYTPAMRMTSTDYSLS 82 8 

q v 948 MTVTKSKNNQLFVNLKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRS YM 1003 

111:111 

Db 8 2 9 GACGMDRRQNTYIM 8 42 

Qy 1004 GKSLKDLGKNSEYNSSNCSLSSSENPYATIKDPPVLI PKSSECGYVEMKSPARRDS PYAE 1063 

| || || : I I I I I : I II I I I I I I II I I : I I I I I I I I I I III: 
Db 843 DKGFKDYMKESVCSSSTCSLNSSENPYATIKDPPILTCKLPESSYVEMKSPVHMGSPYTD 902 

Qy 1064 INNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSP 1123 

: : : : : | : | : | | I I I I I I M I : I I : I I I I : I I I I I I I I I I I I I I 

Db 903 VPSLSTSNKNIYEVEPTVSWQEGCGHNSSYIQNAYDLPRNSHIPGHYDLLPVRQSPANG 962 

Qy 1124 KQED 1127 

: I 

Db 963 PSQD 966 



RESULT 3 
Q8BKK7 



ID Q8BKK7 PRELIMINARY; PRT; 947 AA. 

AC Q8BKK7; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel . 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE MEGF11 protein. 

GN 2410080H04RIK. 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6J; TISSUE=Dorsal root ganglion;

RX MEDLINE=22354683; PubMed=12466851;

RA The FANTOM Consortium,

RA the RIKEN Genome Exploration Research Group Phase I & II Team;

RT "Analysis of the mouse transcriptome based on functional annotation of

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

DR EMBL; AK051642; BAC34702.1;

DR MGD; MGI:1920951; 2410080H04Rik.

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0045285; C : ubiquinol-cytochrome-c reductase complex; IEA. 

DR GO; GO: 0005198; F: structural molecule activity; IEA. 

DR GO; GO: 0008121; F : ubiquinol-cytochrome-c reductase activity; IEA. 

DR GO; GO:0006118; P:electron transport; IEA. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR002049; Laminin_EGF. 

DR InterPro; IPR005805; Rieske. 

DR Pfam; PF00008; EGF; 11. 

DR PRINTS; PR00011; EGFLAMININ.

DR SMART; SM00181; EGF; 15.

DR SMART; SM00180; EGF_Lam; 15.

DR PROSITE; PS00022; EGF_1; 15.

DR PROSITE; PS01186; EGF_2; 15.

DR PROSITE; PS00200; RIESKE_2; 1.

SQ SEQUENCE 947 AA; 100661 MW; 0C209B11DFEE8314 CRC64;

Query Match 52.0%; Score 3509.5; DB 11; Length 947; 

Best Local Similarity 56.1%; Pred. No. 1.5e-271; 

Matches 577; Conservative 122; Mismatches 217; Indels 113; Gaps 13 
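
Each result carries three semicolon-separated statistics lines like the ones above. As a rough illustration of how they could be read programmatically, here is a small sketch; `parse_stats` is a hypothetical helper, not part of GenCore, and it assumes every field is a label followed by a single numeric value.

```python
def parse_stats(lines):
    """Parse 'Label value;' fields from a result's statistics lines into a dict.

    Assumes each semicolon-separated field ends with one numeric token
    (optionally followed by '%'); everything before it is the label.
    """
    stats = {}
    for line in lines:
        for part in line.split(";"):
            part = part.strip()
            if not part:
                continue
            label, value = part.rsplit(None, 1)
            stats[label] = float(value.rstrip("%"))
    return stats


# The three statistics lines printed for RESULT 3 in this report.
result_3 = parse_stats([
    "Query Match 52.0%; Score 3509.5; DB 11; Length 947;",
    "Best Local Similarity 56.1%; Pred. No. 1.5e-271;",
    "Matches 577; Conservative 122; Mismatches 217; Indels 113; Gaps 13",
])
print(result_3["Score"], result_3["Pred. No."])
```

Splitting on the last whitespace keeps multi-word labels such as "Best Local Similarity" and "Pred. No." intact.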

I S LLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNWFKCTRHRV 74 
|| : | | | | | | | | | M II I I : I I I I I I I I I M I 11 M I I I I M I I I M I I : 



I I : I I I I I : I I I I I : I I I I M : I I : I : I = 

ovyfpAVDDCT.DTMVDRpcinr.rpnYYF.NnnFr.T 9 9 



|| : | | | I I I :: I I I I : I I I I I I II I II I I I MINIM: | MM I 
-RCDSEHWGPHCSNRCQCQNGALCNPITGACVCAPGFRGWRCEELCAPGTHGKGCQLL 15 6 



QY 


15 


Db 


8 


Qy 


75 


Db 


68 


QY 


135 


Db 


100 



Qy 



195 CQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECS 2 54 



I I I : M : I I I I : I I : I I M I I I I : I I I I = 

Db 157 CQCHHGASCDPRTGECLCAPGYTGVYCEELCPPGSHGAHCELRCPCQNGGTCHHITGECA 216 

Qy 255 CPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVG 314 

i | || | || | I I I I I I : I I I I : I I I : I I II MINI: I I I : I I I : I I I I 

Db 217 CPPGWTGAVCAQPCPPGTFGQNCSQDCPCHHGGQCDHVTGQCHCTAGYMGDRCQEECPFG 27 6 

Qy 315 TYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLEN 374 

| : | M : : : I : I I I I I I : I I : I I I I I I I : I I I I I II 

D b 277 TFGFLCSQRCDCHNGGQCSPATGACECEPGYKGPSCQERLCPEGLHGPGCTLPCPCDTEN 336 

Qy 375 THSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGF 434 

| ||ll::l I I : I I I II I I I I : I hll I I I : I i I I I M I : I I I I I I I I I 

Db 337 TISCHPVTGACTCQPGWSGHYCNESCPAGYYGNGCQLPCTCQNGADCHSITGSCTCAPGF 396 

q y 435 KGIDCSTPCPLGTYGINCSSRCGCKNDAVCS PVDGSCTCKAGWHGVDCSIRCPSGTWGFG 494 

| | : I I I I M : I I I : I I I : I I I ! I i I 

Db 397 MGEVCAVPCAAGTYGPNCSSVCSCSNGGTCSPVDGSCTCREGWQGLDCSLPCPSGTWGLN 456 

Qy 495 CNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTT 554 

|| II I II II: I I : I I Ml I : I I I I I I I I : I I I I : I I I I I I I I I I I I 

Db 457 CNETCICANGAACSPFDGSCACTPGWLGDSCELPCPDGTFGLNCSEHCDCSHADGCDPVT 516 

q y 555 GHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRI 614 

| | | I I I I : I : I I ! I III : I I : I I I I I I : I 1 I M II I I I I II I I 

Db 517 GHCCCLAGWTGIRCDSTCPPGRWGPNCSVSCSCENGGSCSPEDGSCECAPGFRGPLCQRI 57 6 

Qy 615 CSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICT 674 

| | M I M I : I I I I I I I M I I II : I : I : I I I I I : I M I I : I I I I I : : I I : I : 

Db 577 CPPGFYGHGCAQPCPLCVHSRGPCHHISGICECLPGFSGALCNQVCAGGHFGQDCAQLCS 636 

Qy 675 CTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCT 734 

I I I I I I : I I I I I II : I I I II I I II II I 
Db 637 CANNGTCSPIDGSCQCFPGWIGKDCSQGCPSA 668 

Q y 735 PGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTY 794 

I : M I I I I I I I I I I II I I : I : I I I I I I I I I I I I I : I I I : 

Db 669 FFGKDCGHICQCQNGASCDHITGKCTCRTGFSGRHCEQRCAPGTF 713 

Qy 795 GYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSY 854 

M | | : | : I : I : I I : I I I I : I I I I I I I I I : I I I I I I I : : - M : : I I I I : = 

Db 714 GYGCQQLCECMNNATCDHVTGTCYCSPGFKGIRCDQA-ALMMDELNPYTKISPALGAERH 772 

Qy 855 QIGAIAGIIILVLWLFLLALFIIYRHKQKGKESSM-PAVTYTPAMRWNADYTISGTLP 913 

: | | : | I : : I : : I : I I I I I : I I I : I I : I I I I I I :: I I - I 

Db 77 3 SVGAVTGIVLLLFLVWLLGLFAWRRRRQKEKGRDLAPRVSYTPAMRMTSTDYSLSDL-- 83 0 

Qy 914 HSNGGNANSHYFTNPSYHTLTQC ATS PHVNNRDRMTVTKSKNNQLFWLKNVNPGKR 97 0 

: : : : : I : I I I I I I I III : I I : I I I I I 
Db 831 — SQSSSHAQCFSNASYHTLA-CGGPATS-QASTLDRNSPTKLSNKSL DR 876 

Qy 971 GPVGDCTGTLPADWKHGGYLNELGAF GLDRSYMGK SLKDLGKNSE- YNSSNC 1021 

| I I : 1 I : I : I : | : | : : : I : : I 

Db 877 DTAG WTPYSYVNVLDSHFQI SALEARYPPEDFYIELRHLSRHAEPHSPGTC 927 



Qy 



1022 SLSSSENPY 1030 



Db 92 8 GMDRRQNTY 936 



RESULT 4 
Q8WUL3 

ID Q8WUL3 PRELIMINARY; PRT; 567 AA. 

AC Q8WUL3; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Similar to MEGF10 protein. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Muscle;

RA Strausberg R.;

RL Submitted (DEC-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC020198; AAH20198.1; -.

DR GO; GO:0005198; F:structural molecule activity; IEA.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR002049; Laminin_EGF.

DR Pfam; PF00008; EGF; 7.

DR PRINTS; PR00011; EGFLAMININ.

DR SMART; SM00180; EGF_Lam; 4.

DR PROSITE; PS00022; EGF_1; 10.

DR PROSITE; PS01186; EGF_2; 10.

KW EGF-like domain; Laminin EGF-like domain.

SQ SEQUENCE 567 AA; 60797 MW; CF2FB8CDEB7CF627 CRC64;

Query Match 51.4%; Score 3468; DB 4; Length 567; 

Best Local Similarity 99.8%; Pred. No. 1.6e-268; 

Matches 565; Conservative 1; Mismatches 0; Indels 0; Gaps 0; 



Qy    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSC 60

Qy   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIA 120

Qy  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCED 180

Qy  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 RCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPC 240

Qy  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 QNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSP 300

Qy  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLY 360

Qy  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGAD 420

Qy  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGV 480

Qy  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAE 540

Qy  541 RCDCSHADGCHPTTGHCRCLPGWSGV 566
        |||||||||||||||||||||||||:
Db  541 RCDCSHADGCHPTTGHCRCLPGWSGL 566



RESULT 5
Q80T91

ID Q80T91 PRELIMINARY; PRT; 921 AA.

AC Q80T91;

DT 01-JUN-2003 (TrEMBLrel. 24, Created)

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE MKIAA1781 protein (Fragment).

GN MKIAA1781.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Brain;

RX MEDLINE=22579291; PubMed=12693553;

RA Okazaki N., Kikuno R., Ohara R., Inamoto S., Aizawa H., Yuasa S.,

RA Nakajima D., Nagase T., Ohara O., Koga H.;

RT "Prediction of the coding sequences of mouse homologues of KIAA gene:

RT II. The complete nucleotide sequences of 400 mouse KIAA-homologous

RT cDNAs identified by screening of terminal sequences of cDNA clones

RT randomly sampled from size-fractionated libraries.";

RL DNA Res. 10:35-48(2003).

DR EMBL; AK122555; BAC65837.1;

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0045285; C:ubiquinol-cytochrome-c reductase complex; IEA.

DR GO; GO:0005198; F:structural molecule activity; IEA.

DR GO; GO:0008121; F:ubiquinol-cytochrome-c reductase activity; IEA.

DR GO; GO:0006118; P:electron transport; IEA.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR002049; Laminin_EGF.

DR InterPro; IPR005805; Rieske.

DR Pfam; PF00008; EGF; 10.

DR PRINTS; PR00011; EGFLAMININ.

DR SMART; SM00181; EGF; 14.

DR SMART; SM00180; EGF_Lam; 14.

DR PROSITE; PS00022; EGF_1; 14.

DR PROSITE; PS01186; EGF_2; 14.

DR PROSITE; PS00200; RIESKE_2; 1.

FT NON_TER 1 1

SQ SEQUENCE 921 AA; 97316 MW; 60A34D9513A600F7 CRC64;

Query Match 49.1%; Score 3311.5; DB 11; Length 921; 

Best Local Similarity 59.6%; Pred. No. 9.6e-256; 

Matches 552; Conservative 110; Mismatches 233; Indels 31; Gaps 10; 

Qy 225 CPPGKHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCH 284 

M I I II M I I I t I I I I I I I : II I I : II I I I I I I I : I I I I : I II 

Db 1 CPPGSHGAHCELRCPCQNGGTCHHITGECACPPGWTGAVCAQPCPPGTFGQNCSQDCPCH 60 

Qy 285 NGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAG 344 

: M | I | | | I M : I I I : I I I : I I I I I : I I I I I I I I : I : I I I I I I 

Db 61 HGGQCDHVTGQCHCTAGYMGDRCQEECPFGTFGFLCSQRCDCHNGGQCSPATGACECEPG 12 0 

Qy 345 FAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGF 404 

: | | : M I I I I I : I I I I I I N II I h N I I N I I N I I I I N I : 
Db 121 YKGPSCQERLCPEGLHGPGCTLPCPCDTENTISCHPVTGACTCQPGWSGHYCNESCPAGY 180 

Q y 4 05 YGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVC 4 64 

|| || | : | | II I I I I : I I I I I I I I i I I : I I I I II II I I I I I I 
Db 181 YGNGCQLPCTCQNGADCHSITGSCTCAPGFMGEVCAVPCAAGTYGPNCSSVCSCSNGGTC 240 

Qy 465 SPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEK 524 

| || | | | | | | : | I I : I I I : I I I I I II I I I I I I I II : I I : I I I I I I : 

Db 241 SPVDGSCTCREGWQGLDCSLPCPSGTWGLNCNETCICANGAACS PFDGSCACTPGWLGDS 300 

Qy 52 5 CELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLP 58 4 

I I I I I I I I : i I I I : I I I I I II I I I I I I I I II I I : I : I I I I I I I : 

Db 301 CELPCPDGTFGLNCSEHCDCSHADGCDPVTGHCCCLAGWTGIRCDSTCPPGRWGPNCSVS 360 

Qy 585 CYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGL 644 

[ | : | | I M I : II I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I = I : 

Db 361 CSCENGGSCSPEDGSCECAPGFRGPLCQRICPPGFYGHGCAQPCPLCVHSRGPCHHISGI 42 0 

Qy 645 CDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCP 704 

|:||IM: : I I I M : : I I : I : I : I I I Nihil N 

Db 421 CECLPGFSGALCNQVCAGGHFGQDCAQLCSCANNGTCSPIDGSCQCFPGWIGKDCSQACP 480 

Qy 705 PAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNG 764 

|| | |||:MIII III II I I I I I I I I I : I I I I I I 1 = 1111 I I I I I I I 
Db 481 SGFWGSACFHTCSCHNGASCSAEDGACHCTPGWTGLFCTQRCPSAFFGKDCGHICQCQNG 540 

Qy 765 ADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQLCDCLNNSTCDHLTGTCYCSPGWK 824 

| | | | | : | : || I I II I I I I I I I : I I I : I N h h h h I h I I I h II I I I I I I h I 
Db 541 ASCDHITGKCTCRTGFSGRHCEQRCAPGTFGYGCQQLCECMNNATCDHVTGTCYCSPGFK 600 

Qy 825 GARCDQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGIIILVLVVLFLLALFIIYRHKQK 884

| | | | | | : : : II : : I I I I : : : I h I h : h : h I I I I INI 
Db 601 G I RC DQ A- ALMMD E LN P YT K I S PAL GAE RH S VGAVT G I VL L L FL VWL LGL FAW RRRRQ K 659 

Qy 885 GKESSM-PAVTYTPAMRWNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQC— ATSP 940 
I : I I : : : I I : : I : = : : : I N I N I 



Db 660 EKGRDLAPRVSYTPAMRMTSTDYSLSDL SQSSSHAQCFSNASYHTLA-CGGPATS- 713 

Qy 941 HVNNRDRMTVTKSKNNQLFVNLKNVNPGKRGPVGDC TGTLPADWKHGGYLNEL 9 93 

: || : || | | : I II I I : : I I 

Db 714 QASTLDRNSPTKLSNKSLDRDTAGWTPYSYVNVLDSHFQISALEARYPPEDFYIELRHLS 773 

Qv 994 GAFGLDRS YMGKSLKDLGKNSEYNSSNCSLS SSENPYATIKDPPVLI P 1041 

| | ; | | || || II : I I I I I : I I I I I I I I I I M I : I 

Db 774 RHAEPHSPGTCGMDRRQNTYIMDKGFKDYMKESVCSSSTCSLNSSENPYATIKDPPILTC 833 

q v 1042 KSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPTVSWQGVFSNNGRLSQDPYDL 1101 

| | IN Ml :: : : : : I : I : I I M I I I I I I I :| hllM 

Db 834 KLPESSYVEMKSPVHLGSPYTDVPSLSTSNKNI YEVEPTVSVVQEGRGHNSSYIQNPYDL 893 

Qy 1102 PKNSHIPCHYDLLPVRDS-SSSPKQE 1126 

I I M M I I I I II I I I I : Ml 
Db 894 PKNSHIPGHYDLLPVRQSPAHGPFQE 919 



RESULT 6 
Q8VHL7 

ID Q8VHL7 PRELIMINARY; PRT; 1034 AA. 

AC Q8VHL7; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created)

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Jedi protein.

GN 3110045G13RIK.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL; TISSUE=Testis;

RA Krivtsov A.V., Zinovyeva M.V., Hendrikx J., Visser J.W.M.,

RA Belyavsky A.V.;

RT "Jedi is a novel DSL and EGF-like repeat motif-containing protein

RT expressed on non-differentiated hematopoietic cells.";

RL Submitted (NOV-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AF444274; AAL38571.1; -.

DR MGD; MGI:1920432; 3110045G13Rik.

DR GO; GO:0005198; F:structural molecule activity; IEA.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR009030; Grow_fac_recep.

DR InterPro; IPR002049; Laminin_EGF.

DR Pfam; PF00008; EGF; 6.

DR PRINTS; PR00011; EGFLAMININ.

DR SMART; SM00180; EGF_Lam; 4.

DR PROSITE; PS00022; EGF_1; 13.

DR PROSITE; PS01186; EGF_2; 12.

KW EGF-like domain; Laminin EGF-like domain.

SQ SEQUENCE 1034 AA; 110540 MW; 5514E5166AE01111 CRC64;



Query Match 39.6%; Score 2668; DB 11; Length 1034;

Best Local Similarity 42.7%; Pred. No. 2.9e-204;

Matches 493; Conservative 110; Mismatches 379; Indels 172; Gaps 16;



Qy 


14 


LLLCHWIGTASPLNLEDPNVCSHWES YSVTVQESYPHPFDQI YYTSCTDILNW FKCT 

1 M : II 1 1 1 1 1 : 1 1 1 : : 1 : 1 1 • ' ' ■ II 1 1 
LLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRPFSLLPAESCH— RPWEDPHTCA 


70 


Db 


7 


64 


Qy 


71 


RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 

i i iii i iii iii ii . i i i i i i . i i t iii iii 
: 1 | 1 1 1 1 1 1 : 1 1 1 h 1 1 1 1 1 1 1 1 : 1 1 M 1 1 : 1 1 i 1 1 1 1 1 1 

QPTWYRTWRQWKMDSRPRLQCCRGYYESRGACVPLCAQECVHGRCVAPNQCQCAPGW 


130 


Db 


65 


124 


Qy 


131 


GGTNCS SACDGDHWGPHCT S RCQCKNGALCNP ITGACHCAAGFRGWRCEDRCEQGTYGND 

■ iiii ill 1 . 1 . 1 1 I 1 • 1 • 1 1 ill 

I : 1 1 1 1 IIII 1 1 1 : 1 : 1 : 1 1 1 1 • 1 • 1 1 III 
RGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGACFCPSGLQPPNCLQPCPAGHYGPA 


190 


Db 


125 


184 


Qy 


191 


CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 

■ I i i i i i i i i i i i i i 
I Ml 1 1 : II 1 1 1 1 1 I 1 1 1 1 1 : 1 1 1 1 M 1 
CQFDCQCY-GASCDPQDGACFCPPGRAGPSCNVPCSQGTDGFFCPRTYPCQNGGVPQGSQ 


250 


Db 


185 


243 


Qy 


251 


GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCD7\ATGQCHCSPGYTGERCQDE 

, ,1 iii i r t i i ii 11)1(1.111 l.lll.l 

| | | | | | M | : I I 1 1 I 1 1 II : 1 1 1 : M 1 1 1 II 1 II II 1 : 1 1 1 1 • 1 1 1 ■ 1 

GSCSCPPGWMGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCQEE 


310 


Db 


244 


303 


Qy 


311 


CPVGT YGVLCAETCQCVNGGKC YHVS GACLCEAG FAGERCEARLC P EGL YGI KCDKRC PC 

| | | | : | I I 1 1 1 1 1 : 1 : : 1 1 M 1 1 1 1 1 : II 1 1 1 1 : 1 1 1 = 1 = 1 1 
CPVGRFGQDCAETCDCAPGARCFPANGACLCEHGFTGDRCTERLCPDGRYGLSCQEPCTC 


370 


Db 


304 


363 


Qy 


371 


HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 

■ ■ i ■ i i i i i i ii II 1 1**111 

DPEHSLSCHPMHGECSCQPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGLCLADSGLCRC 


430 


Db 


364 


423 


Qy 


431 


APGFKGIDCSTPCPLGTYGINCS SRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 

ii i i i i i i i i i i i i iii. ii. i ii ii iii. ii ii 
|||: | | : II 1 1 1 II II 1 1 1 1 : 1 1 1 1 : 1 1 : 1 1 1 M -II- M II 

APGYTGPHCANLCPPDTYGINCSSRCSCENAIACSPIDGTCICKEGWQRGNCSVPCPLGT 


490 


Db 


424 


483 


Qy 


491 


WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 

■ i iii iii i i iii I .1 II 1 T I 1 • 1 1 i 

| | | | | : I 1 1 : 1 1 : 1 1 1 1 1 1 1 1 hill 1 : 1 II 1 I 1 1 • 1 1 1 

WGFNCNASCQCAHDGVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGC 


550 


Db 


484 


543 


Qy 


551 


HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCS PDDGICECAPGFRGTT 

ii i i i ii it iii i lilt .1 ..1 i I 1 1 1 1 1 1 . 

| | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 :: 1 1 1 1 1 1 1 II • 
DPVHGQCRCQAGWMGTRCHLPCPEGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPS 


610 


Db 


544 


603 


Qy 


611 


CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGETGALCNEVCPSGRFGKNCA 


670 


Db 


604 


Ml 1 II M II 1 


619 


Qy 


671 


GICTCTNN-GTCNPIDRSCQCYPGW1GSDCSQPCPPAHWGPNC1HTCNCHNGAFCSAYDG 

.iii iii ii I 1 1 1 1 1 . 1 1 1 t ! 1 1 111*1 1 1 I 

| | 1 1 : I : I 1 : 1 1 1 1 1 II 1 : 1 1 1 II 1 1 1 1 1 • 1 1 M 
— CKCNNNHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDG 


729 


Db 


620 


677 


Qy 


730 


ECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKC 


789 


Db 


678 


1 1 1 1 1 M 1 1 : 1 1 : 1 : 1 : : 1 II 1 


712 


Qy 


790 


PSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVI1VGNLNSLSRTSTAL 

1 II 1 1 II II 1 : 1 : 1 : 1 : 
EMCHPQTGACVCPPGHSGADCK MGSQESFTIMPTS- 


849 


Db 


713 


747 



Qy 850 PADS YQIGAI AGI 1 1 LVLWLFLLALFI I YRHKQKGKES SMPAVT YTPAMRWNADYT I S 909 

| : i I : I I : I : I : I : I I Mil II I : I : : I I : 

Db 748 PVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTG-RLDGSDYVMP 806 

Qy 910 GTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGK 969 

| : | I I : : I I I I I I I : I I : : I I I MUM: 

Db 807 DVSP SYSHYYSNPSYHTLSQCSPNPPPPN KVPGSQLFVSSQAPERPS 853 

Qy 970 RGPVGDCTGTLPADWKHGGYLNELGAFGLDRSY MGKS 1006 

| I I I I I I I I =1 M HIM Ml 

Db 854 RAHGRENHVTLPADWKHRREPHERGASHLDRS YSCSYSHRNGPGPFCHKGPISEEGLGAS 913 

Qy 1007 LKDLGKNSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINN 1066 

: | I I I I M I M M I I : I M II I I I I : : 

Db 914 VMSL SSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLH- 958 

Qy 1067 STSANRNVYEVEP TVSWQGVFSNNGRLSQDP YDLPKNSHI PCHY 1111 

: | : : : | I : I I I I I I I I I M I I I 

Db 959 — LRDRQQRQLQPQRDSGTYEQPSPLSHNEESLGSTPPLPPGLPPGQYDSPKNSHIPGHY 1016 



Qy 1112 DLLPVRDSSSSPKQ 1125 

II Ml II: 

Db 1017 DLPPVRHPPSPPSR 1030 



RESULT 7 
Q8VIK5 

ID Q8VIK5 PRELIMINARY; PRT; 1034 AA. 

AC Q8VIK5; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE MEGF12. 

GN 3110045G13RIK OR MEGF12. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6; TISSUE=Liver; 

RA Ivanova N.B., Lemischka I.R.; 

RT "The global gene expression profiling of the hematopoietic stem 

RT cell."; 

RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Eye; 

RX MEDLINE=22354683; PubMed=12466851; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation 

RT of 60,770 full-length cDNAs."; 

RL Nature 420:563-573(2002). 

DR EMBL; AF440279; AAL33583.1; 

DR EMBL; AK053551; BAC35426.1; 

DR MGD; MGI:1920432; 3110045G13Rik. 



DR GO; GO:0005198; F:structural molecule activity; IEA. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR009030; Grow_fac_recep. 

DR InterPro; IPR002049; Laminin_EGF. 

DR Pfam; PF00008; EGF; 6. 

DR PRINTS; PR00011; EGFLAMININ. 

DR SMART; SM00180; EGF_Lam; 4. 

DR PROSITE; PS00022; EGF_1; 13. 

DR PROSITE; PS01186; EGF_2; 12. 

KW EGF-like domain; Laminin EGF-like domain. 

SQ SEQUENCE 1034 AA; 110580 MW; 714E5016848E4E4C CRC64; 

Query Match 39.5%; Score 2667; DB 11; Length 1034; 

Best Local Similarity 42.8%; Pred. No. 3.5e-204; 

Matches 494; Conservative 110; Mismatches 378; Indels 172; Gaps 17; 
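
The two percentages in the statistics block above can be reproduced from the other figures in the report. The sketch below assumes that "Query Match" is the hit score divided by the perfect score of 6744 given in the run header, and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). Both definitions are inferred from the numbers, not taken from GenCore documentation, and the function names are hypothetical.

```python
# Sketch: recompute the summary percentages printed for a hit.
# Assumption: the definitions below are inferred from the reported
# figures, not from GenCore documentation.

PERFECT_SCORE = 6744  # "Perfect score" from the run header


def percent(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


def query_match(score, perfect_score=PERFECT_SCORE):
    """Hit score as a percentage of the query's self-score."""
    return percent(score, perfect_score)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return percent(matches, columns)


if __name__ == "__main__":
    # Figures for RESULT 7 (Q8VIK5)
    print(query_match(2667))
    print(best_local_similarity(494, 110, 378, 172))
```

With the RESULT 7 figures this gives 39.5 and 42.8, which agree with the values printed in the report.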

Qy 14 LLLCHWIGTASPLNLEDPNVCSHWESYS VTVQESYPHPFDQI YYTSCTDILNW FKCT 70 

Ml : II I i I I I : I I I : : i : I I : I 1 : II I I 

Db 7 LLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRPFSLLPAESCH— RPWEDPHTCA 64 

Qy 71 RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 130 

: | I I I I I I I : I II I : I I I : I I I I I I : I Ill 

Db 65 QPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPLCAQECVHGRCVAPNQCQCAPGW 124 

Qy 131 GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 190 

| : | | | | I I I I I I I : I : I : I I I : I : I I I I I 

Db 125 RGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGTCFCPSGLQPPNCLQPCPAGHYGPA 184 

Qy 191 CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 250 

| |||||:|| I I I I I I I I I I I I : M I I M I 

Db 185 CQFDCQCY-GASCDPQDGACFCPPGRAGPSCNVPCSQGTDGFFCPRTYPCQNGGVPQGSQ 243 

Qy 251 GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 310 

| | | M M | | : | I I I I I I I I : I II : I I I II II M M I I : I I I I : I I I : I 
Db 244 GSCSCPPGWMGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCQEE 303 

Qy 311 CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 370 

| I M : I I I I I I I I : I : : I I I I I I M I : II I I I I : I I I : I : I I 

Db 304 CPVGRFGQDCAETCDCAPGARCFPANGACLCEHGFTGDRCTERLCPDGRYGLSCQEPCTC 363 

Qy 371 HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 430 

I : : I I I I I I I I : I : I I I : I I : I II : I : I I I : I I : I I : : I I I 

Db 364 DPEHSLSCHPMHGECSCQPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGLCLADSGLCRC 423 

Qy 431 APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 490 

I | | : | I : I I I I I I I I : I I II : I I : I I I I I : I I : I I I I 

Db 424 APGYTGPHCANLCPPDTYGINCSSRCSCENAIACSPIDGTCICKEGWQRGNCSVPCPLGT 483 

Qy 491 WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 550 

| | | | | : | | | : I | : I I II I I I I 1 = 111 I : I II I I I I : I I I 

Db 484 WGFNCNASCQCAHDGVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGC 543 

Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 610 

| | | | | Ml | I II II III 11111:1 :: I I I I I I II I : 
Db 544 DPVHGQCRCQAGWMGTRCHLPCPEGFWGAKCSNTCTCKNGGTCVSENGNCVCAPGFRGPS 603 

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670 



1 1 1 I I i 1 1 1 1 I 

Db 604 CQRPCPPGRYGKRCVQ- 619 



Qy 671 GICTCTNN-GTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDG 729 

I I I I : I : I I : I : I I I I I I I I I I : I I I I 

Db 620 — CKCNNNHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDG 677 

Qy 730 ECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKC 789 

[ I I I I I I I I : I I : I : I : : I I I I I I 1 
Db 678 SCICTPGWTGPNCLEGCPPRMFGVNCSQLCQCDLGEMC HPE 718 

Qy 790 PSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTAL 849 

II I I II II I : | : | : | : 

Db 719 TGACVCPPGHSGADCK MGSQESFTIMPTS- 747 

Qy 850 PADS YQI GAI AGI I ILVLWLFLLALFI I YRHKQKGKES SMPAVTYTPAMRWNADYTI S 909 

| : I | : I I : I : I : I : I I I I I I II : I = : ' I 1 

Db 748 PVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTG-RLDGSDYVMP 806 

Qy 910 GTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGK 969 

I : I II : : M I I I I I : II : : I I I = I I I I : : 

Db 807 DVSP SYSHYYSNPSYHTLSQCSPNPPPPN KVPGSQLFVSSQAPERPS 853 



Qy 970 RGPVGDCTGTLPADWKHGGYLNELGAFGLDRSY MGKS 1006 

I : I I I I I I I I : I I I M I I I : I I 

Db 854 RAHGRENHVTLPADWKHRREPHERGASHLDRSYSCSYSHRNGPGPFCHKGPISEEGLGAS 913 



Qy 1007 LKDLGKNSEYNSSNCSLSSSENPYATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINN 1066 

: | I I I I I I I I I : I I I : I I I M I I I I 

Db 914 VMSL SSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLH- 958 



Qy 1067 STSANRNVYEVEP TVSWQGVFSNNGRLSQDP YDLPKNSHIPCHY 1111 

: | : : : | I : I I I I I I I I I I I I I I 

Db 959 — LRDRQQRQLQPQRDSGTYEQPSPLSHNEESLGSTPPLPPGLPPGHYDSPKNSHIPGHY 1016 



Qy 1112 DLLPVRDSSSSPKQ 1125 

I I I I I II: 
Db 1017 DLPPVRHPPSPPSR 1030 
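
Many alignment rows in this listing have damaged labels or coordinates, so a quick consistency check can be useful when repairing them. The sketch below assumes each intact row has the form `label start sequence end`, and that the end coordinate equals the start coordinate plus the number of residue letters minus one, with gap characters not counted. The regular expression and the function name are illustrative only and are not part of GenCore.

```python
import re

# Assumed row layout: "Qy|Db", start coordinate, sequence, end coordinate.
ROW = re.compile(r"^(Qy|Db)\s+(\d+)\s+(\S+)\s+(\d+)$")


def check_row(line):
    """Return (label, start, end, consistent) for an alignment row.

    Returns None when the line does not have the assumed layout,
    for example when gaps were extracted as spaces inside the sequence.
    """
    m = ROW.match(line.strip())
    if m is None:
        return None
    label = m.group(1)
    start = int(m.group(2))
    seq = m.group(3)
    end = int(m.group(4))
    # Only letters are residues; '-' gap characters do not advance the coordinate.
    residues = sum(1 for ch in seq if ch.isalpha())
    return label, start, end, start + residues - 1 == end


if __name__ == "__main__":
    print(check_row("Qy 1112 DLLPVRDSSSSPKQ 1125"))
    print(check_row("Db 1017 DLPPVRHPPSPPSR 1030"))
```

A row that fails the check usually has a split coordinate (such as `13 0` for `130`) or a residue pair that was merged during extraction.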



RESULT 8 
Q8CGA7 

ID Q8CGA7 PRELIMINARY; PRT; 1004 AA. 

AC Q8CGA7; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Similar to RIKEN cDNA 3110045G13 gene. 

GN 3110045G13RIK. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=FVB/N; 

RA Strausberg R.; 

RL Submitted (JAN-2003) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC042490; AAH42490.1; 

DR MGD; MGI:1920432; 3110045G13Rik. 

DR GO; GO:0005198; F:structural molecule activity; IEA. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR009030; Grow_fac_recep. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR002049; Laminin_EGF. 

DR Pfam; PF00008; EGF; 6. 

DR PRINTS; PR00011; EGFLAMININ. 

DR SMART; SM00181; EGF; 14. 

DR SMART; SM00180; EGF_Lam; 12. 

DR PROSITE; PS00022; EGF_1; 12. 

DR PROSITE; PS01186; EGF_2; 11. 

SQ SEQUENCE 1004 AA; 107377 MW; 9508B0EC04561E94 CRC64; 
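
Each database record above is a flat file in which a two-letter line code (ID, AC, DT, DR, SQ and so on) is followed by a value. As a minimal sketch, and assuming the code and value are separated by whitespace as they are in this listing, the lines can be grouped by code as shown below. The function name `parse_record` is hypothetical, and the sample lines are copied from the RESULT 8 record.

```python
from collections import defaultdict


def parse_record(lines):
    """Group flat-file lines by their two-letter line code.

    Assumption: each line starts with a two-letter uppercase code,
    followed by whitespace and the value. Other lines are ignored.
    """
    fields = defaultdict(list)
    for line in lines:
        code, _, value = line.strip().partition(" ")
        if len(code) == 2 and code.isupper() and value.strip():
            fields[code].append(value.strip())
    return dict(fields)


if __name__ == "__main__":
    record = parse_record([
        "ID Q8CGA7 PRELIMINARY; PRT; 1004 AA.",
        "AC Q8CGA7;",
        "DR Pfam; PF00008; EGF; 6.",
        "DR SMART; SM00181; EGF; 14.",
    ])
    # The sequence length is the second-to-last token of the ID line.
    length = int(record["ID"][0].split()[-2])
    print(record["AC"], len(record["DR"]), length)
```

Repeated codes such as DR are kept as a list in their original order, which is enough for checking that a repaired record still carries every cross-reference.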

Query Match 37.5%; Score 2529; DB 11; Length 1004; 

Best Local Similarity 41.3%; Pred. No. 3.6e-193; 

Matches 477; Conservative 105; Mismatches 368; Indels 206; Gaps 



Qy 


14 


LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQI YYTSCTDILNW FKCT 

Ml : II 1 1 1 1 1 : 1 1 1 : : 1 : 1 1 : 1 1 = M 1 1 

LLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRPFSLLPAESCH RPWEDPHTCA 


70 


Db 


7 


64 


Qy 


71 


RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 

: | Ml M 1 1 : II 1 1 : II 1 1 II 1 1 : 1 1 1 1 1 1 : 1 1 1 1 1 1 1 I 1 
QPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPLCAQECVHGRCVAPNQCQCAPGW 


130 


Db 


65 


12 4 


Qy 


131 


GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 

| : | | | | 1 II 1 I 1 1 : 1 : 1 : 1 1 1 1 : 1 = 1 1 III 

RGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGACFCPSGLQPPNCLQPCPAGHYGPA 



190 


Db 


125 


184 


Qy 


191 


CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 

| | I | 11:11 I I 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 
CQFDCQCY-GASCDPQDGACFCPPGRAGPSCNVPCSQGTDGFFCPRTYPCQNGGVPQGSQ 


250 


Db 


185 


243 


Qy 


251 


GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 

| M | | | M I : 1 1 1 1 1 1 111:111:1111111 1 : : : 

GSCSCPPGWMGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCQEE 


310 


Db 


244 


303 


Qy 


311 


CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 

| | | | : | Mill 1 1 : 1 : : 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 M 1 1 1 : M 1 1 

CPVGRFGQDCAETCDCAPGARCFPANGACLCEHGFTGDRCTERLCPDGRYGLSCQEPCTC 


370 


Db 


304 


363 


Qy 


371 


HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 
|:: MIM 1 1 :l h H II 


430 


Db 


364 


393 


Qy 


431 


APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 

|| | : I | : II M 1 1 M 1 1 M 1 : 1 II 1 M 1 M II II M 1 : II 1 1 
APGYTGPHCANLCPPDTYGINCSSRCSCENAIACSPIDGTCICKEGWQRGNCSVPCPLGT 


490 


Db 


394 


453 


Qy 


491 


WGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGC 

|| | || : | || : I I : 1 1 1 1 II 1 1 Mill Ml 1 1 1 M M II 1 
WGFNCNASCQCAHDGVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGC 


550 


Db 


454 


513 



Qy 



551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 610 



I IN I I I I MM :| :: M : 

Db 514 DPVHGQCRCQAGWMGTRCHLPCPEGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPS 573 

Qy 611 CQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCA 670 

I I I I II I I I I I 

Db 574 CQRPCPPGRYGKRCVQ — 589 

Qy 671 GICTCTNN-GTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDG 729 

I I I I : | : | | : | | I I I M I : M I M I I I M : I I I I 
Db 590 --CKCNNNHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDG 647 

Qy 730 ECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKC 789 

I I I M I I I I : I I : | : | : : M I I I II 

Db 648 SCICTPGWTGPNCLEGCPPRMFGVNCSQLCQCDLGEMC HPE 688 

Qy 790 PSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTAL 849 

I I I I II III : | : | : I : 

Db 689 TGACVCPPGHSGADCK MGSQESFTIMPTS- 717 

Qy 850 PADS YQI GAIAGI 1 1 LVLWLFLLALFI I YRHKQKGKES SMPAVT YTPAMRVVNADYTI S 909 

| : [ ! : M : I M : I : M I I M Mill MM M M I : 

Db 718 PVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTG-RLDGSDYVMP 776 

Qy 910 GTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGK 969 

I : M I : : I M I I I I : I I : M I I Mil: : 

Db 777 DVSP SYSHYYSNPSYHTLSQCSPNPPPPN KVPGSQLFVSSQAPERPS 823 



Qy 970 RGPVGDCTGTLPADWKHGGYLNELGAFGLDRSY MGKS 1006 

I : I I I I I II I :: I I I I M I Ml 

Db 824 RAHGRENHTTLPADWKHRREPHDRGASHLDRSYSCSYSHRNGPGPFCHKGPISEEGLGAS 883 



Qy 1007 LKDLGKNSEYNSSNCSLS SSENPYATIKDPPVLI PKSSECGYVEMKS PARRDSPYAEINN 1066 

: | II M II M M I I I : I I M M I I I : : 

Db 884 VMSL SSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLH- 928 

Qy 1067 STSANRNVYEVEPTVSWQGVFSNNGRLSQDP YDLPKNSHI PC 1109 

: | : : : | I : I I : II I II II II 

Db 929 — LRDRQQRQLQPQRD — SGTYEQPSPLSHNEESLGSTSPLPPGLPPGHYDSPKNSHIPG 984 

Qy 1110 HYDLLPVRDSSSSPKQ 1125 

I II i I II II: 
Db 985 HYDLPPVRHPPSPPSR 1000 



RESULT 9 
Q8VHF4 

ID Q8VHF4 PRELIMINARY; PRT; 747 AA. 

AC Q8VHF4; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Jedi-736 protein. 

GN 3110045G13RIK. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 



RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL; TISSUE=Testis; 

RA Krivtsov A.V., Zinovyeva M.V., Hendrikx J., Visser J.W.M., 

RA Belyavsky A.V.; 

RT "Jedi is a novel DSL and EGF-like repeat motif-containing protein 

RT expressed on non-differentiated hematopoietic cells."; 

RL Submitted (DEC-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF461685; AAL66380.1; 

DR MGD; MGI:1920432; 3110045G13Rik. 

DR GO; GO:0005198; F:structural molecule activity; IEA. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR009030; Grow_fac_recep. 

DR InterPro; IPR002049; Laminin_EGF. 

DR Pfam; PF00008; EGF; 6. 

DR PRINTS; PR00011; EGFLAMININ. 

DR SMART; SM00180; EGF_Lam; 4. 

DR PROSITE; PS00022; EGF_1; 13. 

DR PROSITE; PS01186; EGF_2; 12. 

KW EGF-like domain; Laminin EGF-like domain. 

SQ SEQUENCE 747 AA; 78972 MW; F825F8F384D4736A CRC64; 




Query Match 34.0%; Score 2292.5; DB 11; Length 747; 
Best Local Similarity 48.5%; Pred. No. 1.9e-174; 

Matches 377; Conservative 75; Mismatches 272; Indels 53; Gaps 5; 


Qy 


14 


LLLCHWIGTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYTSCTDILNW FKCT 

Ml : M M 1 1 1 : 1 i 1 : : 1 : 1 1 : II = 1 1 1 1 

LLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRPFSLLPAESCH RPWEDPHTCA 


70 


Db 


7 


64 


Qy 


71 


RHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTCQCEPGW 

: | Ml M 1 1 : M 1 1 : 1 1 : 1 1 1 1 II : 1 1 1 1 1 1 1 M 

QPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPLCAQECVHGRCVAPNQCQCAPGW 


130 


Db 


65 


124 


Qy 


131 


GGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGND 

| : M 1 1 1 1 1 1 1 1 1 : 1 : 1 : 1 M 1 : 1 : 1 1 1 1 1 

RGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGACFCPSGLQPPNCLQPCPAGHYGPA 


190 


Db 


125 


184 


Qy 


191 


CHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVT 

| Ml I 1 : 1 i 1 1 1 1 1 1 1 : 1 

CQFDCQCY-GASCDPQDGACFCPPGRAGPSCNVPCSQGTDGFFCPRTYPCQNGGVPQGSQ 


250 


Db 


185 


243 


Qy 


251 


GECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDE 

111111111:1 111:111:111 1 M 1 : M 1 1 : 1 1 1 : 1 

GSCSCPPGWMGVICSLPCPEGFHGPNCTQECRCHNGGLCDRFTGQCHCAPGYIGDRCQEE 


310 


Db 


244 


303 


Qy 


311 


CPVGTYGVLCAETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPC 

| | | | : I Mill 1 1 : 1 : : 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 : 1 1 1 : 1 : 1 1 
CPVGRFGQDCAETCDCAPGARCFPANGACLCEHGFTGDRCTERLCPDGRYGLSCQEPCTC 


370 


Db 


304 


363 


Qy 


371 


HLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTC 

| : : | | | | | II 1 : 1 : 1 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 : 1 1 : 1 1 : : 1 1 1 
DPEHSLSCHPMHGECSCQPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGLCLADSGLCRC 


430 


Db 


364 


423 



Qy 431 APGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGT 490 

| | | : | I : II I I I I I I I I I I I : I I I I : I I : I II I I : I I : I I I I 
Db 424 APGYTGPHCANLCPPDTYGINCSSRCSCENAIACSPIDGTCICKEGWQRGNCSVPCPLGT 483 



Qy 


491 


Db 


484 


Qy 


551 


Db 


544 


Qy 


611 


Db 


604 


Qy 


D / -L 


Db 


620 


Qy 


730 


Db 


678 



: | | | : I I : I I I I I I I I hill I : I 1 = 111 

WGFNCNASCQCAHDGVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGC 

HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTT 
| | | I | Ml I M M:l =: I I I I = 



I II I I I I I I I I 

rnDDrDor'Dvr'^Drvn 619 



MM : I : I I : I I I I I I I I : I I I I I I I : I 

-CKCNNNHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTC: 

1CKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHI SGQCTCRTGFMGRH 1 
| | | | | | | | |:| I : I : I : : II I I I : I I I I I 



RESULT 10 
Q8ND91 



ID Q8ND91 PRELIMINARY; PRT; 626 AA. 

AC Q8ND91; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Hypothetical protein (Fragment). 

GN DKFZP434L121. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Testis; 

RA Poustka A., Wellenreuther R., Mewes H.W., Weil B., Wiemann S.; 

RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AL834326; CAD38994.1; -. 

DR GO; GO:0016020; C:membrane; IEA. 

DR GO; GO:0045285; C:ubiquinol-cytochrome-c reductase complex; IEA. 

DR GO; GO:0005198; F:structural molecule activity; IEA. 

DR GO; GO:0008121; F:ubiquinol-cytochrome-c reductase activity; IEA. 

DR GO; GO:0006118; P:electron transport; IEA. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR002049; Laminin_EGF. 

DR InterPro; IPR005805; Rieske. 

DR Pfam; PF00008; EGF; 7. 

DR PRINTS; PR00011; EGFLAMININ. 

DR SMART; SM00181; EGF; 11. 

DR SMART; SM00180; EGF_Lam; 11. 

DR PROSITE; PS00022; EGF_1; 11. 

DR PROSITE; PS01186; EGF_2; 11. 

DR PROSITE; PS00200; RIESKE_2; 1. 





KW Hypothetical protein; EGF-like domain; Laminin EGF-like domain. 
FT NON_TER 1 1 

SQ SEQUENCE 626 AA; 64059 MW; C166FE1BD2A949F9 CRC64; 

Query Match 31.7%; Score 2135.5; DB 4; Length 626; 

Best Local Similarity 60.6%; Pred. No. 5.4e-162; 

Matches 339; Conservative 61; Mismatches 144; Indels 15; Gaps 4; 

Qy 261 GTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLC 320 

I I I I M I I I I : I I I I : I I I : I I I I I I I I I I : I I I : M 1 : I I I I : : I I 
Db 6 GAVCAQPCPPGTFGQNCSQDCPCHHGGQCDHVTGQCHCTAGYMGDRCQEECPFGSFGFQC 65 

Qy 321 AETCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHP 380 

: : | | M I : I : I I I II I : I I I : I I I I I II : I I III : I I I I I I 

Db 66 SQRCDCHNGGQCSPTTGACECEPGYKGPRCQERLCPEGLHGPGCTLPCPCDADNTISCHP 125 

Qy 381 MSGECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCS 440 

:: | | I : I I I I I : I I I : I I : I I : I I I : I I I I I I I I : I I I I I I I I I I I : 
Db 126 VTGACTCQPGWSGHHCNESCPVGYYGDGCQLPCTCQNGADCHSITGGCTCAPGFMGEVCA 185 

Qy 441 TPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQ 500 

| | | II I I I I I I I I I I I I I I I I I I I I I - I I I M ! I II I I : I 

Db 186 VSCAAGTYGPNCSSICSCNNGGTCSPVDGSCTCKEGWQGLDCTLPCPSGTWGLNCNESCT 245 

Qy 501 CLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCL 560 

| || ||: : M : | : I III I : Mill I I I : I I I I = I MINIMI I I II I II 
Db 246 CANGAACSPIDGSCSCTPGWLGDTCELPCPDGTFGLNCSEHCDCSHADGCDPVTGHCCCL 305 

Qy 561 PGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFY 620 

| | : | : | | | | : I I: : I I II I M I 

Db 306 AGWTGIRCDSTCPPGRWGPNCSVSCSCENGGSCSPEDGSCECAPGFRGPLCQRICPPGFY 365 

Qy 621 GHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGT 680 

M I : I II I I I M I I I I I : I : I : I I I I I = • : : I I : I : I I I I I 

Db 366 GHGCAQPCPLCVHSSRPCHHISGICECLPGFSGALCNQVCAGGYFGQDCAQLCSCANNGT 425 

Qy 681 CNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGL 740 

|:||| 111:111 I Ml Ml I I MINI III I I II I I 

Db 426 CSPIDGSCQCFPGWIGKDCSQACPPGFWGPACFHACSCHNGASCSAEDGACHCTPGWTGL 485 

Qy 741 YCTQRCPLGFYGKDCALI CQCQNGADCDHI S GQCTCRTGFM GRHCEQKCPSGTYGY 796 

= ||M | : : I I I : I I I I I I I : I 

Db 486 FCTQRKPHLLASQPLRIPC-CGLLATVGIVQ TSREGGMQAAPGLWPDSCPTRTEEL 541 

Qy 797 GCRQICDCLNNSTCDHITG 815 

I : I I I I 
Db 542 CRGSSRPDWIQG 553 



RESULT 11 
O88281 

ID O88281 PRELIMINARY; PRT; 1574 AA. 

AC O88281; 

DT 01-NOV-1998 (TrEMBLrel. 08, Created) 
DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update) 
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 
DE MEGF6. 



GN MEGF6. 
OS Rattus norvegicus (Rat). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 
OX NCBI_TaxID=10116; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC STRAIN=Sprague-Dawley; TISSUE=Brain; 
RX MEDLINE=98360089; PubMed=9693030; 
RA Nakayama M., Nakajima D., Nagase T., Nomura N., Seki N., Ohara O.; 
RT "Identification of high-molecular-weight proteins with multiple EGF- 
RT like motifs by motif-trap screening."; 
RL Genomics 51:27-34(1998). 
DR EMBL; AB011532; BAA32462.1; 
DR PIR; T13954; T13954. 
DR HSSP; P00736; 1APQ. 
DR GO; GO:0005509; F:calcium ion binding; IEA. 
DR GO; GO:0005198; F:structural molecule activity; IEA. 
DR InterPro; IPR000152; Asx_hydroxyl_S. 
DR InterPro; IPR001881; EGF_Ca. 
DR InterPro; IPR006209; EGF_like. 
DR InterPro; IPR002049; Laminin_EGF. 
DR Pfam; PF00008; EGF; 20. 
DR PRINTS; PR00011; EGFLAMININ. 
DR SMART; SM00179; EGF_CA; 4. 
DR PROSITE; PS00010; ASX_HYDROXYL; 5. 
DR PROSITE; PS00022; EGF_1; 23. 
DR PROSITE; PS01186; EGF_2; 23. 
DR PROSITE; PS01187; EGF_CA; 5. 
KW EGF-like domain. 
SQ SEQUENCE 1574 AA; 165445 MW; 2B48533D8F77F6E7 CRC64; 



Query Match 29.0%; Score 1958; DB 11; Length 1574; 

Best Local Similarity 41.3%; Pred. No. 2.7e-147; 

Matches 344; Conservative 77; Mismatches 306; Indels 106; Gaps 16; 



Qy 95 CP-GFYESGEMCVP--HCADKCVHGRC-IAPNTCQCEPGWGGTNCSSACDGDHWGPHCTS 150 

| | I I I | : I III:: ill I I : M I I M : I I I : 

Db 602 CPKGFY--GKHCRKKCHCANR---GRCHRLYGACLCDPGLYGRFCHLACPPWAFGPGCSE 656 

Qy 151 RCQCKNG--ALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTG 208 

| | : I | I I : I I I I I : I I I : I I I : I I II M I I I I : I 

Db 657 DCLCEQSHTRSCNPKDGSCSCKAGFQGERCQAECESGFFGPGCRHRCTCQPGVACDPVSG 716 

Qy 209 ECR--CPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQ 266 

I M MM I M I I I I I 

Db 717 ECRTQCPPGYQGEDCGQECPVGTFGVNCSGSCSCV-GAPCHRVTGECLCPPGKTGEDCGA 775 

Qy 267 PCPEGRFGKNCSQEC-QCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQ 325 

MM M I : I I : I : I : M I I M : I I I I I I I I I I I 
Db 776 DCPEGRWGLGCQEICPACEHGASCNPETGTCLCLPGFVGSRCQDTCSAGWYGTGCQIRCA 835 

Qy 326 CVNGG KCYH VSGACLC 341 

Ml II I I I I I I 

Db 836 CANDGHCDPTTGRCSCAPGWTGLSCQRACDSGHWGPDCIHPCNCSAGHGNCDAVSGLCLC 895 



Qy 342 EAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCS 401 



Ill: I III : I :l II I h =1 HI I I M I : I I 

Db 896 EAGYEGPRCE-QSCRQGYYGPSCEQKCRC — EHGAACDHVSGACTCPAGWRGSFCEHACP 952 

Qy 402 PGFYG EA CQQICSCQNG 418 

M : | : I I I I I : I I I 

Db 953 AGFFGLDCDSACNCSAGAPCDAVTGSCICPAGRWGPRCAQSCPPLTFGLNCSQICTCFNG 1012 

Qy 419 ADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWH 478 

I | M M I : I I I I I : I I II I M II I I : i I I : I I I I I I 
Db 1013 ASCDSVTGQCHCAPGWMGPTCLQACPPGLYGKNCQHSCLCRNGGRCDPILGQCTCPEGWT 1072 

Qy 479 GVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYGLNC 538 

|:| I I : I I I I I I : I i I : I I I I I I I : I I : I I I : I : : I 

Db 1073 GLACENECLPGHYAAGCQLNCSCLHGGICDRLTGHCLCPAGWTGDKCQSSCVSGTFGVHC 1132 

Qy 539 AERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDG 598 

Ml II II I I III I Ih I I : I Is M HI I 

Db 1133 EEHCACRKGASCHHVTGACFCPPGWRGPHCEQACPRGWFGEACAQRCLCPTNASCHHVTG 1192 

Qy 599 ICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNE 658 

I I I I I I : I : : I I I : I I I I I = I = I : I I I : I I : 

Db 1193 ECRCPPGFTGLSCEQACQPGTFGKDCEHLC-QCPGETWACDPASGVCTCAAGYHGTGCLQ 1251 

Qy 659 VCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNC 718 

Mill: : I =11 I " I = : | | : I I I I 

Db 1252 RCPSGRYGPGCEHICKCLNGGTCDPATGACYCPAGFLGADCSLACPQGRFGPSCAHVCAC 1311 

Qy 719 HNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRT 778 

M I I I I : I I I I : I II : II I I I I : I I I : I I : I 

Db 1312 RQGAACDPVSGACICSPGKTGVRCEHGCPQDRFGKGCELKCACRNGGLCHATNGSCSCPL 1371 

Qy 779 GFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGARCDQA 831 

|:||||| I I : I I I I I I I I : I : I I I I I I : I I : : 
Db 1372 GWMGPHCEHACPAGRYGAACLLECFCQNNGSCEPTTGACLCGPGFYGQACEHS 1424 



RESULT 12 


Q9TVQ2 


ID Q9TVQ2 PRELIMINARY; PRT; 1664 AA. 

AC Q9TVQ2; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Y64G10A.7 protein. 

GN Y64G10A.7. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Mortimore B.J.; 

RL Submitted (APR-1999) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99069613; PubMed=9851916; 

RA none; 

RT "Genome sequence of the nematode C.elegans: A platform for 

RT investigating biology."; 

RL Science 282:2012-2018(1998). 

RN [3] 

RP 

RA Ainscough R.; 

RL EMBL/GenBank/DDBJ databases. 

DR EMBL; AL117206; CAB60454.1; 

DR EMBL; AL110498; CAB60454.1; JOINED. 

DR EMBL; AL110498; CAB57911.1; 

DR EMBL; AL117206; CAB57911.1; JOINED. 

DR HSSP; P00736; 1APQ. 

DR WormPep; Y64G10A.7; CE24549. 

DR GO; GO:0005509; F:calcium ion binding; IEA. 

DR GO; GO:0005198; F:structural molecule activity; IEA. 

DR InterPro; IPR000152; Asx_hydroxyl_S. 

DR InterPro; IPR001881; EGF_Ca. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR002049; Laminin_EGF. 

DR Pfam; PF00008; EGF; 22. 

DR PRINTS; PR00011; EGFLAMININ. 

DR SMART; SM00179; EGF_CA; 4. 

DR PROSITE; PS00010; ASX_HYDROXYL; 4. 

DR PROSITE; PS00022; EGF_1; 22. 

DR PROSITE; PS01186; EGF_2; 24. 

DR PROSITE; PS01187; EGF_CA; 3. 

KW EGF-like domain. 

SQ SEQUENCE 1664 AA; 179279 MW; A69F093B4C705832 CRC64; 



Query Match 28.6%; Score 1931; DB 5; Length 1664; 

Best Local Similarity 39.2%; Pred. No. 4.1e-145; 

Matches 330; Conservative 85; Mismatches 311; Indels 116; Gaps 



Qy 


95 


CP-GFYESGEMCVPHCADKCVHGRCIAP— NTCQCEPGWGGTNCSSACDGDHWGPHCTSRC 


152 


Db 


664 


I I M 1 1 i 1 1 : 1 1 1 1 1 1 1 : 1 1 : 1 : 1 1 
CPDGFY — GSQCNLKCRMDCPNGRCDPVFGYCTCPDGLYGQSCEKPCPHFTFGKNCRFPC 


721 


Qy 


153 


QC — KNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGEC 


210 


Db 


722 


: | : | hlllll hi 1 : 1 1 : 1 1 : 1 1 1 1 I M 1 : 1 

KCARENSEGCDEITGKCRCKPGYYGHHCKRMCSPGLFGAGCAMKCSCPAGIRCDPVTGDC 


781 


Qy 


211 


— RCPPGYTGAFCEDLCPPGKHGPQCEQRCPC QNGGVCHHVTGECSCPSGWMGT 

: | | || | I : III 1 llhll 1 1 1 1 1 1 1 1 : 1 1 1 

TKKCPAGYQGNLCDQPCPAGYFGYDCEQKCSCADVASPHKSKVCHHVTGTCTCLPGKTGP 


262 


Db 


782 


841 


Qy 


263 


VCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAE 

: | | | : I 1 1 : MM 1 1 : 1 1 1 1 : 1 1 : 1 1 : 1 1 1 : 1 : 1 : 

LCDQSCAPNTYGPNCAHTCSCVNGAKCDESDGSCHCTPGFYGATCSEVCPTGRFGIDCMQ 


322 


Db 


842 


901 


Qy 


323 


TCQCVNGGKCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMS 
| : | M 1 : 1 : 1 1 1 : : 1 : : 1 : : 1 1 : 1 1 hi 1 • Ml 


382 


Db 


902 


LCKCQNGAICDTSNGSCECAPGWSGKKCD-KACAPGTFGKDCSKKCDC-ADGMH-CDPSD 


958 


Qy 


383 


GECACKPGWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFK 

| | | | I I 1 hill 1 : 1 1 : 1 II 1 1 1 1 1 1 1 M 1 1 1 1 1 h = 
GECICPPGKKGHKCDETCDSGLFGAGCKGICSCQNGATCDSVTGSCECRPGWRGKKCDRP 


435 


Db 


959 


101 



Qy 



436 GIDCSTPCPLGTYG 449 

I II I 1 I I I : I 

Db 1019 CPDGRFGEGCNAICDCTTTNDTSMYNPFVARCDHVTGECRCPAGWTGPDCQTSCPLGRHG 1078 



Qy 450 INCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNT 509 

| | | | | | | I I I : I : I : I I I I I I I I I :: I I I I 

Db 1079 EGCRHSCQCSNGASCDRVTGFCDCPSGFMGKNCESECPEGLWGSNCMKHCLCMHGGECNK 1138 

Qy 510 LDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCD 569 

: | | | Ml || I I : I I I I : I I : I : I I I I I I II i I I I I : 
Db 1139 ENGDCECIDGWTGPSCEFLCPFGQFGRNCAQRCNCKNGASCDRKTGRCECLPGWSGEHCE 1198 

Qy 570 SVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTCP 629 

| | : | | | I : I I I I I llllhll III I : : I I I I : I 
Db 1199 KSCVSGHYGAKCEETCECENGALCDPISGHCSCQPGWRGKKCNRPCLKGYFGRHCSQSC- 1257 

Qy 630 QCVHSSGPCHHITGLCDCLPGFTGALCNEVCP 661 

: ! : I 111:111 I : I 11 = 11 
Db 1258 RCANSKS-CDHISGRCQCPKGYAGHSCTELCPDGTFGESCSQKCDCGENSMCDAISGKCF 1316 

Qy 662 SGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPA 706 

I I M : I : I : I I I I : I I I I I : I I : I I 

Db 1317 CKPGHSGSDCKSGCVQGRFGPDCNQLCSCENGGVCDSSTGSCVCPPGYIGTKCEIACQSD 1376 

Qy 707 HWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGAD 766 

: M | Mill I I : I : I I I : I I : I I I I I = I I hi I I 

Db 1377 RFGPTCEKICNCENGGTCDRLTGQCRCLPGFTGMTCNQVCPEGRFGAGCKEKCRCANG-H 1435 

Qy 767 CDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNSTCDHITGTCYCSPGWKGA 826 

| : | | : | | Ml I II I I I I I I I hi : II : I I I I I h 
Db 1436 CNASSGECKCNLGFTGPSCEQSCPSGKYGLNCTLDCECYGQARCDPVQGCCDCPPGRYGS 1495 

Qy 827 RC 828 

I I 

Db 1496 RC 1497 



RESULT 13 




Q9W0A0 




ID Q9W0A0 PRELIMINARY; PRT; 881 AA. 

AC Q9W0A0; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE CG2086-PB. 

GN DRPR OR CG2086. 

OS Drosophila melanogaster (Fruit fly). 

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20196006; PubMed=10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 



RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 

RA Brandon R.C., Rogers Y.H., Blazej R.G., Champe M., Pfeiffer B.D., 

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Gabor G.L., 

RA Abril J.F., Agbayani A., An H.J., Andrews-Pfannkoch C., Baldwin D., 

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J., 

RA Hostin D., Houston K.A., Howland T.J., Wei M.H., Ibegwam C., 

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X., 

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 

RA Reinert K., Remington K., Saunders R.D., Scheeler F., Shen H., 

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 

RA Wang Z.Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., Ye J., 

RA Yeh R.F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Misra S., Crosby M.A. , Matthews B.B., Bayraktaroglu L . f Campbell K. , 

RA Hradecky P., Huang Y., Kaminker J.S., Prochnik S.E., Smith CD., 

RA Tupy J.L., Bergman CM., Berman B.P., Carlson J.W., Celniker S.E., 

RA Clamp M.E., Drysdale R.A. , Emmert D. , Frise E . , de Grey A.D.N. J., 

RA Harris N.L., Kronmiller B . , Marshall B., Millburn G.H., Richter J . , 

RA Russo S., Searle S.M.J. , Smith E., Shu S., Smutniak F . , 

RA Whitfield E.J., Ashburner M. , Gelbart W.M., Rubin G.M. , Mungall C.J., 

RA Lewis S.E.; 

RT "Annotation of Drosophila melanogaster genome."; 

RL Submitted (MAR-2000) to the EMBL/ GenBank/ DDB J databases. 

RN [3] 

RP SEQUENCE FROM N.A. 
RA FlyBase; 

RL Submitted (SEP-2002) to the EMBL/ GenBank/DDBJ databases. 
RN [4] 

RP SEQUENCE FROM N.A. 
RA FlyBase; 

RL Submitted (JAN-2003) to the EMBL/ GenBank/ DDB J databases. 
DR EMBL; AE003472; AAF47553.2; 

DR GO; GO: 0005198; F: structural molecule activity; IEA. 



DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR003006; IgJMHC. 

DR InterPro; IPR002049; Laminin_EGF. 

DR Pfam; PF00008; EGF; 7. 

DR PRINTS; PR00011; EGFLAMININ. 

DR SMART; SM00181; EGF; 12. 

DR SMART ; SMO 018 0; EGFJLam; 1 1 . 

DR PROSITE; PS00022; EGF_1; 11. 

DR PROSITE; PS01186; EGF_2 ; 13. 

DR PROSITE; PS00290; IG_MHC; 1. 

SQ SEQUENCE 881 AA; 96380 MW; 52 196D164F52F5C1 CRC64; 

Query Match 27.2%; Score 1832.5; DB 5; Length 881; 

Best Local Similarity 34.0%; Pred. No. 1.4e-137; 

Matches 342; Conservative 117; Mismatches 386; Indels 161; Gaps 17; 
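The two percentages in the statistics block above can be reproduced from the other numbers in this report. The sketch below is a reading inferred from the figures themselves, not documented GenCore behaviour: it assumes "Query Match" is the hit score divided by the perfect score (6744 for this query), and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the query's perfect (self) score."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns.

    Assumption: every alignment column is counted exactly once in one of
    the four reported categories.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Figures taken from the Q9W0A0 hit above (perfect score 6744).
qm = round(query_match_percent(1832.5, 6744), 1)
bls = round(best_local_similarity(342, 117, 386, 161), 1)
print(qm, bls)
```

Under these assumptions the same two functions also agree with the figures printed for the next two hits (Q8T3A7 and Q9XWD6), which is the main reason to believe the reading is right.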

Qy 151 RCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGEC 210 

: | I i I : I I : I I I I I : i I I I i : I : I : i : : I : I : I I I I I : I I I 

Db 2 QCDCLNNAVCEPFSGDCECAKGYTGARCADICPEGFFGANCSEKCRCENGGKCHHVSGEC 61 

Qy 211 RCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPE 270 

: I I I : II I : I I I I I I I I : I I I II I I I MM I I I I I II 

Db 62 QCAPGFTGPLCDMRCPDGKHGAQCQQDCPCQNDGKCQPETGACMCNPGWTGDVCANKCPV 121 

Qy 271 GRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGG 330 

| : | | : | : | : I I I I I I I I II I I I I I I I : II I I : I I I I 
Db 122 GSYGPGCQESCECYKGAPCHHITGQCECPPGYRGERCFDECQLNTYGFNCSMTCDCANDA 181 

Qy 331 KCYHVSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPG 390 

| : | | : | I : I : I I : I | | : | : : I I : I : I I I I : I I I I 

Db 182 MCDRANGTCICNPGWTGAKCAERICEANKYGLDCNRTCECDMEHTDLCHPETGNCQCSIG 241

Qy 391 WSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGI 450 

| | | | : || I : I : I : I I I I I I I I I I I :: I I I I I : I 

Db 242 WSSAQCTRPCTFLRYGPNCELTCNCKNGAKCSPVNGTCLCAPGWRGPTCEESCEPGTFGQ 301

Qy 451 NCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTL 510 

: I : I I I : I I I I I I I I I I : I I : I I I I I I I I 

Db 302 DCALRCDCQNGAKCEPETGQCLCTAGWKNIKCDRPCDLNHFGQDCAKVCDCHNNAACNPQ 361 

Qy 511 DGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDC--SHADGCHPTTGHCRCLPGWSGVHC 568

: I : I I I I II I I : I I I I : I : I I : : I I : : : I I I I I ill 

Db 362 NGSCTCAAGWTGERCERKCDTGKFGHDCAQKCQCDFNNSLACDATNGRCVCKQDW-GV-- 418 

Qy 569 DSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRCSQTC 628 

I I I : I I I I I I I : I : I I I I I I I I I : I 

Db 419 CRCLNNSSCDPDSGNCICSAGWTGADCAEPCPPGFYGMECKERC 462 

Qy 629 PQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSC 688 

I : : I : I I I I I I I : I I I I : I : I I I I : I I I : I 

Db 463 PEILHGNKSCDHITGEILCRTGYIGLTCEHPCPAGLYGPGCKLKCNCEHGGECNHVTGQC 522 

Qy 689 QCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGLYCTQRCPL 748 

li : I : : I I : I i I I : I I I I I I I ' I I : I I 

Db 523 QCLPGWTGSNCNESCPTDTYGQGCAQRCRCVHHKVCRKADGMCICETGWSGTRCDEVCPE 582 



Qy 749 GFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNS 808 

1111:1 ||: | : | ||:|: | :|:: I ::| I II 

Db 583 GFYGEHCMNTCACPSANFQCHAAHGCVCRSGYTGDNCDELIAS QRIADQSENS 635 

Qy 809 TCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGIIILVLV 868 

I I I I I : : : : I 

Db 636 SRASVALT LVLMTLF 650

Qy 869 VLFLLALFIIYRHKQKGKESSMPAVTYT PAMRVVNADYTISGTLPHSNGGNANSHYFTNP 928

: I : I I I I : : : : I I I 

Db 651 ACIIFAVFIYYRRRVSNLKTEIAHVHYT HDTNPPSWPPN HNFDNP 695 

Qy 929 SYHTLTQCATSPH VNNRDRMTVTKSKNNQLFVNLKNVNPGKRGPVGDC TG 978

I : I : : | | | : : : II I 

Db 696 VYGMQAETRLLPNNMRSKMNNFDQRSTMSTDYGD DCNASGRVG 738 

Qy 979 TLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSNCSLSSSENPYATI 1033 

: : : | I i : : I I I : I : I I 

Db 739 SYSINYNHDLLTKNLNADRTNPIVYNESLKE EHVYDEIKHKEG 781 

Qy 1034 -KDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPTVSVVQGV 1087

Ml s| |: | ::: |: |: I I I I :|: I 

Db 782 YKDPVKIYSKILFPE-DEYDHLDYSRPSTSQKPHYHRMNDAMLNINQDEEKPSNVKNMTV 840 

Qy 1088 FSNNGRLSQDPYDLPKNSHIPCH YDLLPVRDSS SSPK 1124

I II II I : I I I I I 

Db 841 LLNK PLPPTEPEPQHECFDNTNTNLDNVSTASPSSSPK 878 



RESULT 14
Q8T3A7
ID Q8T3A7 PRELIMINARY; PRT; 1070 AA.
AC Q8T3A7;
DT 01-JUN-2002 (TrEMBLrel. 21, Created)
DT 01-JUN-2002 (TrEMBLrel. 21, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Y47H9C.4b protein.
GN Y47H9C.4 OR Y47H9C.4B.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RA Harris B.R.;
RL Submitted (OCT-1998) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=99069613; PubMed=9851916;
RA none;
RT "Genome sequence of the nematode C. elegans: A platform for
RT investigating biology.";
RL Science 282:2012-2018(1998).
DR EMBL; AL032657; CAD27614.1; -.
DR WormPep; Y47H9C.4b; CE30361.
DR GO; GO:0005198; F:structural molecule activity; IEA.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR002049; Laminin_EGF.
DR Pfam; PF00008; EGF; 4.
DR PRINTS; PR00011; EGFLAMININ.
DR SMART; SM00180; EGF_Lam; 7.
DR PROSITE; PS00022; EGF_1; 15.
DR PROSITE; PS01186; EGF_2; 11.
KW EGF-like domain.
SQ SEQUENCE 1070 AA; 114180 MW; 75254D0DD5643AE5 CRC64;

Query Match 27.0%; Score 1823.5; DB 5; Length 1070;
Best Local Similarity 32.4%; Pred. No. 9.1e-137;
Matches 362; Conservative 155; Mismatches 416; Indels 183; Gaps 34;

Qy 21 GTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYT SCTDILNWFKCTRHR 73

Ml : : I I : I : : I : : : I I : I I : I 
Db 35 GTTEP QGDHVCT VKTIVDDY--ELKKVIHTWYNDTEQCLNPLTGFQC 80

Qy 74 VSYRTAYRHGEKTMYRRK SQCCPGFYESGE-MCVPHCADKCVHGRCIAPNTC 124

I : i : I I : I : I I I I : I : : : I : I I I hill I 

Db 81 TVEKRGQKASYQRQLVKKEKYVKQCCDGYYQTKDHFCLPDCNPPCKKGKCIEPGKC 136 

Qy 125 QCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQ 184 

: I : I I : I I h I : I II I : I hill hi I I I : I h I I I I I 

Db 137 ECDPGYGGKYCASSCSVGTWGLGCSKSCDCENGANCDPELGTCICTSGFQGERCEKPCPD 196

Qy 185 GTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGG 244

: I : I : I I I I I h I : I I h I I I : I I I I : h I I I I I 
Db 197 NKWGPNCVKSCPCQNGGKCNK-EGKCVCSDGWGGEFCLNKCEEGKFGAECKFECNCQNGA 255 

Qy 245 VCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTG 304

I : | : | | | | : | : | | III I : I : I I I | | : : : | : | I h I I 
Db 256 TCDNTNGKCICKSGYHGALCENECSVGFFGSGCTQKCDCLNNQNCDSSSGECKCI-GWTG 314 

Qy 305 ERCQDECPVGTYGVLCAETCQCV NGGKCYHVSGACLCEAGFAGERCEARLCPEG 358 

: I | I : h I : I I : I : I I I h h I : h I I 

Db 315 KHCDIGCSRGRFGLQCKQNCTCPGLEFSDSNASCDAKTGQCQCESGYKGPKCDERKCDAE 374

Qy 359 LYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEAC--QQICSCQ 416

I I I I I I I I I I I : I I I I I h I I II I I I I : I 

Db 375 QYGADCSKTCTCVRENTLMCAPNTGFCRCKPGFYGDNCELACSKDSYGPNCEKQAMCDWN 434

Qy 417 NGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAV-CSPVDGSCTCKA 475 

: : : I : I I I I I I I : I I I I I I I I I h : I I II I I I I 

Db 435 HASECNPETGSCVCKPGRTGKNCSEPCPLDFYGPNCAHQCQCNQRGVGCDGADGKCQCDR 494 

Qy 476 GWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYG 535

Ml I I |: h I I hi I |: : I I I I I : I h : I : h I I 
Db 495 GWTGHRCEHHCPADTFGANCEKRCKCPKGIGCDPITGECTCPAGLQGANCDIGCPEGSYG 554 

Qy 536 LNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSP 595 

I II: I I I I I II : I | : I: : h : I : I I I I : I I I I 

Db 555 PGCKLHCKCVNGK-CDKETGECTCQPGFFGSDCSTTCSKGKYGESCELSCPCSD-ASCSK 612 

Qy 596 DDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHI TGLC-DCLPGF 651 

Ml | : | : I : I I : I I : I : II hi III 

Db 613 QTGKCLCPLGTKGVSCDQKCDPNTFGFLCQETV TPSPCASTDPKNGVCLSCPPGS 667 



Qy 652 TGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPN 711 

: I I I I : I : I I : I : I : I : I I I I I : I I I : I I : I 

Db 668 SGIHCEHNCPAGSYGDGCQQVCSCADGHGCDPTTGECICEPGYHGKTCSEKCPDGKYGYG 727 

Qy 712 CIHTC-NCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNG-ADCDH 769

I I I : I : I : I I I I I I I : I I I : I I : I : I : I : 

Db 728 CALDCPKCASGSTCDHINGLCICPAGLEGALCTRPCSAGFWGNGCRQVCRCTSEYKQCNA 787 

Qy 770 ISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNST--CDHITGTCYCSPGWKGAR 827 

: I : I : I III I : : I III I : I I : I I : : : I I : I I I : I 
Db 788 QTGECSCPAGFQGDRCDKPCEDGYYGPDCIKKCKCQGTATSSCNRVSGACHCHPGFTGEF 847 

Qy 828 CDQAGVI IVGNLNSLSRTST ALP ADSYQIGAIAG 861 

I : : I I I I I I : I I 

Db 848 C HALCPESTFGLKCSKECPKDGCGDGYECDAAIGCCHVDQMSCGKAKQE 896 

Qy 862 IIILVLVVLF LLALFIIYRHK-QKGKESSMPAVTYTPAMRV 901

: I : : I I I : I I I I I : I I I I : I I I : : 
Db 897 FEALNGAGRSTGLTWFFVLLIVALCGGLGLIALF--YRNKYQKEKDPDMPTVSF 948

Qy 902 VNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVN 961 

I I | | | | : : : | : | : : | : | 
Db 949 HKAPNNDEGREFQNPLY SRQSVFP DSDAFSSENNGNHQ 986

Qy 962 LKNVNPGKRGPVGDCTGTLPADWK HGGYLNELGAFGLDRSYMGK SLK 1008 

III:: II III: II 

Db 987 GGPPNGLLTLEEEELENKKIHG RSAAGRGNNDYASLD 1023 

Qy 1009 DLGKNSEYNSSNCSLSSSENPYATIKDPPVLIPKSS 1044

: : : I : : I I I I I I I I I : I : 

Db 1024 EVAGEGSSSSASASASRRENPYADISSPDPVTQNSA 1059



RESULT 15
Q9XWD6
ID Q9XWD6 PRELIMINARY; PRT; 1111 AA.
AC Q9XWD6;
DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Y47H9C.4 protein (CED-1).
GN Y47H9C.4 OR CED-1.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RA Harris B.;
RL Submitted (OCT-1998) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=94150718; PubMed=7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT elegans.";
RL Nature 368:32-38(1994).
RN [3]
RP SEQUENCE FROM N.A.
RX MEDLINE=21097720; PubMed=11163239;
RA Zhou Z., Hartwieg E., Horvitz H.R.;
RT "CED-1 is a Transmembrane Receptor that Mediates Cell Corpse
RT Engulfment in C. elegans.";
RL Cell 104:43-56(2001).
DR EMBL; AL032657; CAA21739.1; -.
DR EMBL; AF332568; AAG60061.1; -.
DR PIR; T26972; T26972.
DR HSSP; P05106; 1JV2.
DR WormPep; Y47H9C.4a; CE20264.
DR GO; GO:0005198; F:structural molecule activity; IEA.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006212; Furin_repeat.
DR InterPro; IPR002049; Laminin_EGF.
DR Pfam; PF00008; EGF; 4.
DR PRINTS; PR00011; EGFLAMININ.
DR SMART; SM00180; EGF_Lam; 6.
DR SMART; SM00261; FU; 2.
DR PROSITE; PS00022; EGF_1; 15.
DR PROSITE; PS01186; EGF_2; 11.
KW EGF-like domain; Laminin EGF-like domain.
SQ SEQUENCE 1111 AA; 118803 MW; A39F374C008F9874 CRC64;

Query Match 26.8%; Score 1805.5; DB 5; Length 1111;
Best Local Similarity 31.8%; Pred. No. 2.6e-135;
Matches 373; Conservative 162; Mismatches 423; Indels 215; Gaps 40;

Qy 21 GTASPLNLEDPNVCSHWESYSVTVQESYPHPFDQIYYT SCTDILNWFKCTRHR 73
III : : I I : I : : I : : : I I : I I : I
Db 35 GTTEP QGDHVCT VKTIVDDY--ELKKVIHTWYNDTEQCLNPLTGFQC 80

Qy 74 VSYRTAYRHGEKTMYRRK SQCCPGFYESGE-MCVPHCADKCVHGRCIAPNTC 124
I : I : I I : I : I I I I : I : : : I : I I I hill I
Db 81 TVEKRGQKASYQRQLVKKEKYVKQCCDGYYQTKDHFCLPDCNPPCKKGKCIEPGKC 136

Qy 125 QCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNPITGACHCAAGFRGWRCEDRCEQ 184
: I : I I : I I h h I II h I hill hi I I I :||:| III I
Db 137 ECDPGYGGKYCASSCSVGTWGLGCSKSCDCENGANCDPELGTCICTSGFQGERCEKPCPD 196

Qy 185 GTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGG 244
: I : I : I I I I I I : h I I h I I I : I I I I : h I I I I I
Db 197 NKWGPNCVKSCPCQNGGKCNK-EGKCVCSDGWGGEFCLNKCEEGKFGAECKFECNCQNGA 255

Qy 245 VCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTCDAATGQCHCSPGYTG 304
I : h | | | h I : I I III h I : I I I I h : : I : I I h I I
Db 256 TCDNTNGKCICKSGYHGALCENECSVGFFGSGCTQKCDCLNNQNCDSSSGECKCI-GWTG 314


Qy 305 ERCQDECPVGTYGVLCAETCQCV NGGKCYHVSGACLCEAGFAGERCEARLCPEG 358 

: I I I : I : I : I I : I : I I I I : I : I : I : I | 

Db 315 KHCDIGCSRGRFGLQCKQNCTCPGLEFSDSNASCDAKTGQCQCESGYKGPKCDERKCDAE 374

Qy 359 LYGIKCDKRCPCHLENTHSCHPMSGECACKPGWSGLYCNETCSPGFYGEAC--QQICSCQ 416

II I I I I III I I : 1 I I I I I : I I II II I I : | 
Db 375 QYGADCSKTCTCVRENTLMCAPNTGFCRCKPGFYGDNCELACSKDSYGPNCEKQAMCDWN 434 

Qy 417 NGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAV-CSPVDGSCTCKA 475 

: : : I : I I I I I I I : I I I I I I I I I I : : I I II I I I I 

Db 435 HASECNPETGSCVCKPGRTGKNCSEPCPLDFYGPNCAHQCQCNQRGVGCDGADGKCQCDR 494

Qy 476 GWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACNTLDGTCTCAPGWRGEKCELPCQDGTYG 535

III i I I : I : I I I : I I I : : I I II I : I I : : I : I : I I 
Db 495 GWTGHRCEHHCPADTFGANCEKRCKCPKGIGCDPITGECTCPAGLQGANCDIGCPEGSYG 554 

Qy 536 LNCAERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSP 595 

I II: I I I I I I I : I I : I : : 1 : : | : I | | | : | | | | 
Db 555 PGCKLHCKCVNGK-CDKETGECTCQPGFFGSDCSTTCSKGKYGESCELSCPCSD-ASCSK 612 

Qy 596 DDGICECAPGFRGTTCQRICSPGFYGHRCSQTCPQCVHSSGPCHHI TGLC-DCLPGF 651 

III I : I : I : | | : | | : | : | | | : | | | | 

Db 613 QTGKCLCPLGTKGVSCDQKCDPNTFGFLCQETV TPSPCASTDPKNGVCLSCPPGS 667 

Qy 652 TGALCNEVCPSGRFGKNCAGICTCTNNGTCNPIDRSCQCYPGWIGSDCSQPCPPAHWGPN 711 

: I I I I : I : I I : I : I : I : I I I I I : I I I : I I : I 

Db 668 SGIHCEHNCPAGSYGDGCQQVCSCADGHGCDPTTGECICEPGYHGKTCSEKCPDGKYGYG 727 

Qy 712 CIHTC-NCHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNG-ADCDH 769 

I I I : I : I : I I I I I I I : I I I : I I : I : I : I : 

Db 728 CALDCPKCASGSTCDHINGLCICPAGLEGALCTRPCSAGFWGNGCRQVCRCTSEYKQCNA 787

Qy 770 ISGQCTCRTGFMGRHCEQKCPSGTYGYGCRQICDCLNNST--CDHITGTCYCSPGWKGAR 827 

: I : I : I III I : : I 111 I : I I : I I : : : I I : I I I : I 

Db 788 QTGECSCPAGFQGDRCDKPCEDGYYGPDCIKKCKCQGTATSSCNRVSGACHCHPGFTGEF 847

Qy 828 CDQAGVIIVGNLNSLSRTST ALP ADSYQIGAIAG 861

I : : I I I I I I : I I 

Db 848 C HALCPESTFGLKCSKECPKDGCGDGYECDAAIGCCHVDQMSCGKAKQE 896 

Qy 862 IIILVLVVLF LLALFIIYRHK-QKGKESSMPAVTYTPAMRV 901

: I : : I I I : I I I I I : I I I I : III:: 

Db 897 FEALNGAGRSTGLTWFFVLLIVALCGGLGLIALF--YRNKYQKEKDPDMPTVSF 948

Qy 902 VNADYTISGTLPHSNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVN 961 

I I I I I I : : : I : I : : I : I 
Db 949 HKAPNNDEGREFQNPLY SRQSVFP DSDAFSSENNGNHQ 986

Qy 962 LKNVNPGKRGPVGDCTGTLPADWKHGGYLNELGAFGLDRSYM GKSLKDLGKN-- 1013

Mil I : : I : I II 

Db 987 GGPPN--GLLTLEEEELENKKIHGRSAAGRGNNDY 1019

Qy 1014 SEYNSSNCSLSSSE NPYATIKDPPVLI PKSSECGYVEMK SPA 1055 

I : I I : I I : I I : I I : : I : I I I I 

Db 1020 ASLDEVAGEGSSSSASASASRRGLNSSEQSRRP--LLEEHDEEEFDEPHENSISPAHAVT 1077






Qy 1056 RRDSPYAEINN STSANR NVY 1075 

: : I I I : I : : III: hi 

Db 1078 TSNHNENPYADI SSPDPVTQNSANKKRAQDNLY 1110 


Search completed: March 26, 2004, 16:11:10 
Job time : 61.4809 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on:          March 26, 2004, 15:58:50 ; Search time 19.1541 Seconds
                 (without alignments)
                 3099.072 Million cell updates/sec

Title:           US-10-092-390-2
Perfect score:   6744
Sequence:        1 MVISLNSCLSFICLLLCHWI SSPKQEDSGGSSSNSSSSSE 1140

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        141681 seqs, 52070155 residues

Total number of hits satisfying chosen parameters: 141681

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       SwissProt 42:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
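
The "Million cell updates/sec" figure in the run header can be cross-checked against the other header values. The sketch below assumes, as an inference from the numbers rather than from any GenCore documentation, that one cell update corresponds to one (query residue, database residue) pair of the Smith-Waterman matrix, so the rate is query length times database residues divided by search time. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions.

    Assumption: the whole query is compared against every database
    residue, giving query_len * db_residues cells in total.
    """
    cells = query_len * db_residues
    return cells / seconds / 1e6


# Values from the SwissProt run header: query of 1140 residues,
# 52070155 database residues, 19.1541 seconds search time.
rate = million_cell_updates_per_sec(1140, 52070155, 19.1541)
print(f"{rate:.3f}")
```

The result lands within a small fraction of a percent of the reported 3099.072; the residual difference is consistent with the search time having been rounded to four decimals before printing.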



SUMMARIES 



                     %
Result             Query
   No.    Score    Match  Length  DB  ID           Description
     1     1037     15.4    2524   1  NOTC_XENLA   P21783 xenopus lae
     2   1034.5     15.3    2556   1  NTC1_HUMAN   P46531 homo sapien
     3     1028     15.2    2531   1  NTC1_MOUSE   Q01705 mus musculu
     4     1024     15.2    2531   1  NTC1_RAT     Q07008 rattus norv
     5   1014.5     15.0    2471   1  NTC2_HUMAN   Q04721 homo sapien
     6      998     14.8    2471   1  NTC2_RAT     Q9qw30 rattus norv
     7              14.7    2470   1  _MOUSE       O35516 mus musculu
     8      987     14.6    2437   1  _BRARE       P46530 brachydanio
     9              14.5           1  NOTC_DROME   P07207 drosophila
    10    977.5     14.5    2318   1  NTC3_MOUSE   Q61982 mus musculu
    11      974     14.4    2321   1  NTC3_HUMAN   Q9um47 homo sapien
    12    969.5     14.4    2319   1  NTC3_RAT     Q9r172 rattus norv
    13    959.5     14.2    2003   1  NTC4_HUMAN   Q99466 homo sapien
    14    954.5     14.2    1064   1  FBP1_STRPU   P10079 strongyloce
    15    951.5     14.1    1964   1  NTC4_MOUSE   P31695 mus musculu
    16    916.5     13.6    4289   1  TENX_HUMAN   P22105 homo sapien
    17    870.5     12.9     830   1  SREC_HUMAN   Q14162 homo sapien
    18    832.5     12.3    1213   1  JAG3_BRARE   Q90y54 brachydanio
    19      813     12.1     833   1  SRC2_MOUSE   P59222 mus musculu
    20      808     12.0     870   1  SRC2_HUMAN   Q96qp6 homo sapien
    21      789     11.7    1238   1  JAG2_HUMAN   Q9y219 homo sapien
    22    775.5     11.5    2139   1  CRB_DROME    P10040 drosophila
    23      775     11.5    1242   1  JAG1_        Q90y57 brachydanio
    24      769     11.4    2201   1  TENA_HUMAN   P24821 homo sapien
    25      768     11.4    1247   1  JAG2_MOUSE   Q9qye5 mus musculu
    26    767.5     11.4    1746   1  TENA_PIG     Q29116 sus scrofa
    27      757     11.2    1218   1  JAG1_HUMAN   P78504 homo sapien
    28      745     11.0    1202   1  JAG2_RAT     P97607 rattus norv
    29      744     11.0    1219   1  JAG1_RAT     Q63722 rattus norv
    30      739     11.0    1218   1  JAG1_MOUSE   Q9qxx0 mus musculu
    31    736.5     10.9    3695   1  LMA5_        O15230 homo sapien
    32    720.5     10.7    1808   1  TENA_        P10039 gallus gall
    33    717.5     10.6    1801   1  LMB2_RAT     P15800 rattus norv
    34      716     10.6    3106   1  LMA2_MOUSE   Q60675 mus musculu
    35    706.5     10.5    1790   1  LMB1_DROME   P11046 drosophila
    36    704.5     10.4    3084   1  LMA1_MOUSE   P19137 mus musculu
    37    700.5     10.4    1799   1  LMB2_MOUSE   Q61292 mus musculu
    38              10.3    3075   1  LMA1_HUMAN   P25391 homo sapien
    39      690     10.2    1404   1  SERR_DROME   P18168 drosophila
    40    687.5     10.2    3110   1  LMA2_HUMAN   P24043 homo sapien
    41    685.5     10.2    1798   1  LMB2_HUMAN   P55268 homo sapien
    42    683.5     10.1    3718   1  LMA5_MOUSE   Q61001 mus musculu
    43    676.5     10.0    3672   1  LML2_CAEEL   Q21313 caenorhabdi
    44      667      9.9    1786   1  LMB1_HUMAN   P07942 homo sapien
    45      664      9.8    1786   1  LMB1_MOUSE   P02469 mus musculu



ALIGNMENTS 



RESULT 1 
NOTC_XENLA 

ID NOTC_XENLA STANDARD; PRT; 2524 AA.
AC P21783;
DT 01-MAY-1991 (Rel. 18, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Neurogenic locus notch protein homolog precursor (XOTCH protein).
GN XOTCH.
OS Xenopus laevis (African clawed frog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae;
OC Xenopodinae; Xenopus.
OX NCBI_TaxID=8355;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=90385285; PubMed=2402639;
RA Coffman C., Harris W., Kintner C.;
RT "Xotch, the Xenopus homolog of Drosophila notch.";
RL Science 249:1438-1441(1990).
RN [2]
RP REVISIONS TO 1759-1782.
RA Kintner C.;
RL Submitted (JUN-1996) to the EMBL/GenBank/DDBJ databases.
CC -!- SUBCELLULAR LOCATION: Type I membrane protein.
CC -!- DEVELOPMENTAL STAGE: Expressed almost uniformly in early embryos.
CC -!- SIMILARITY: Belongs to the NOTCH family.
CC -!- SIMILARITY: Contains 36 EGF-like domains.
CC -!- SIMILARITY: Contains 3 Lin/Notch repeats.
CC -!- SIMILARITY: Contains 6 ANK repeats.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M33874; AAB02039.1;
DR HSSP; P00740; 1EDM.
DR InterPro; IPR002110; ANK.
DR InterPro; IPR000152; Asx_hydroxyl_S.
DR InterPro; IPR000742; EGF_2.
DR InterPro; IPR001881; EGF_Ca.
DR InterPro; IPR001438; EGF_II.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR002049; Laminin_EGF.
DR InterPro; IPR008297; Notch.
DR InterPro; IPR000800; Notch_dom.
DR Pfam; PF00023; ank; 6.
DR Pfam; PF00008; EGF; 36.
DR Pfam; PF00066; notch; 3.
DR PIRSF; PIRSF002279; Notch; 1.
DR PRINTS; PR00010; EGFBLOOD.
DR PRINTS; PR00011; EGFLAMININ.
DR PRINTS; PR01452; NOTCH.
DR SMART; SM00248; ANK; 6.
DR SMART; SM00179; EGF_CA; 24.
DR SMART; SM00004; NL; 2.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DR PROSITE; PS50088; ANK_REPEAT; 4.
DR PROSITE; PS00010; ASX_HYDROXYL; 23.
DR PROSITE; PS00022; EGF_1; 34.
DR PROSITE; PS01186; EGF_2; 29.

Qy 83 GEKTMYRR KSQC CP-GFYESGEMCVPHCADKCVHGR 117

II:: I : I I I I I I : : : I : : I I : 

Db 53 GERCQFPNPCTIKNQCMNFGTCEPVLQGNAIDFICHCPVGF--TDKVCLTPVDNACVNNP 110 



Qy 118 C IAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALCNP — IT 164 

I : I : I I I I I : I I II I I I I I I 

Db 111 CRNGGTCELLNSVTEYKCRCPPGWTGDSCQQA DPCASN-PCANGGKCLPFEIQ 162 

Qy 165 GACHCAAGFRGWRCE DRCEQ GTYGNDCHQR CQ 196 

I I I I I I :: I I I : I I I I 

Db 163 YICKCPPGFHGATCKQDINECSQNPCKNGGQCINEFGSYRCTCQNRFTGRNCDEPYVPCN 222 

Qy 197 CQNGATC DHVTGECRCPPGYTGAFCED LC 225 

I I i I I i : : I I I I : : I I I : I 
Db 223 PSPCLNGGTCRQTDDTSYDCTCLPGFSGQNCEENIDDCPSNNCRNGGTCVDGVNTYNCQC 282 

Qy 226 PPGKHGPQCEQ RC PCQNGGVCHHVTG--ECSCPSGWMGTVCGQ 266

II I I : I I I I I I I I : I I I : I I I I : 

Db 283 PPDWTGQYCTEDVDECQLMPNACQNGGTCHNTYGGYNCVCVNGWTGEDCSENIDDCANAA 342 

Qy 267 PCPEGRFGKNC — SQEC QCHNGGTCDA — ATGQ — CHCSPG 301 

I I i I I I I I : I I I I : I I II 

Db 343 CHSGATCHDRVASFYCECPHGRTGLLCHLDNACISNPCNEGSNCDTNPVNGKAICTCPPG 402 

Qy 302 YTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA — CLCEAGFAGERCEARLCP 356 

III I I I I : I | : | | : | : | : | | : : 

Db 403 YTGPACNNDVDECSLGAN PCEHGGRCTNTLGSFQCNCPQGYAGPRCEIDV — 452 

Qy 357 EGLYGI KCDKRC PCHLENTHSCHPMSGE--CACKPGWSGLYC 396 

I II : I : I II I I II : I ! I I 

Db 453 NECLSNPC — QNDSTCLDQIGEFQCICMPGYEGLYCETNIDECASNPCLHN 501 

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC DSVTGK- 427 

II I I I I I I I : I : I I I I : I I : 

Db 502 GKCIDKINEFRCDCPTGFSGNLCQHDFDECTSTPCKNGAKCLDGPNSYTCQCTEGFTGRH 561 

Qy 428 CTCAPGFKGIDC STP 442 

1111:11 II 
Db 562 CEQDINECIPDPCHYGTCKDGIATFTCLCRPGYTGRLCDNDINECLSKPCLNGGQCTDRE 621 

Qy 443 CPLGTYGINCSSR CG CKNDAVCSPVDG-SCTCKAGWHGVDCS IR 485 

I I I I I : I i : : I II : I I I I I : I : I | : | 

Db 622 NGYICTCPKGTTGVNCETKIDDCASNLCDNGKCIDKIDGYECTCEPGYTGKLCNININEC 681 



Qy 486 CPSGTWGFGC --- 495



Db 682 DSNPCRNGGTCKDQINGFTCVCPDGYHDHMCLSEVNECNSNPCIHGACHDGVNGYKCDCE 741 

Qy 496 NLTCQ CLNGGACNTLDGT — CTCAPGWRGEKCEL PC-Q 530 

II: I : I I I I : I Nihil: | | 

Db 742 AGWSGSNCDINNNECESNPCMNGGTCKDMTGAYICTCKAGFSGPNCQTNINECSSNPCLN 801 

Qy 531 DGT YGLNC AERCD CSHADGCHPT TGHCRCLPGWS 564

II III I : | : | : I I I I I I 

Db 802 HGTCIDDVAGYKCNCMLPYTGAICEAVLAPCAGSPCKNGGRCKESEDFETFSCECPPGWQ 861

Qy 565 GVHCD SVCAEGRWGPNCSL PCYCKNGASC 593 

IN I I I I I : I I I I I I 

Db 862 GQTCEIDMNECVNRPCRNGATCQNTNGSYKCNCKPGYTGRNCEMDIDDCQPNPCHNGGSC 921

Qy 594 SPDDGI CECAPGFRGTTCQR ICSPGFYGHRC 624

I III I I I I I I I: I I I I I I 

Db 922 S--DGINMFFCNCPAGFRGPKCEEDINECASNPCKNGANCTDCVNSYTCTCQPGFSGIHC 979

Qy 625 SQTCPQCVHSS GPCHHITGL CDCLPGFTGALC NE 658

II II II IN IIININ II 

Db 980 ESNTPDCTESSCFNGGTC— IDGINTFTCQCPPGFTGSYCQHDINECDSKPCLNGGTCQD 1037 

Qy 659 VCPSGRFGKNCAGI CTCTNNGTC NPIDRSCQCYPGWIGSDCSQP 702 

I I I I I I : III! I I I : I II I I I 

Db 1038 SYGTYKCTCPQGYTGLNCQNLVRWCDSSPCKNGGKCWQTNNFYR-CECKSGWTGVYCDVP 1096 

Qy 703 CPPA--HWGPNCIHTCNCHNGAFC--SAYDGECKCTPGWTGLYCTQRCPLGFYGKDC 755 

II I : : I II I : N I I : II I I : : : I 

Db 1097 SVSCEVAAKQQGVDTVHL--CRNSGMCVDTGNTHFCRCQAGYTGS YCEEQV DEC 1148

Qy 756 ALICQCQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

: IIININ:! II I : I : I : : 

Db 1149 S-PNPCQNGATCTDYLGGYSCECVAGYHGVNCSEEINECLSHPCQNGGTCIDLINTYKCS 1207 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-TCYCSPGWKGARCDQAG 832 

I I I I I I I III I I : I I I II: I II: 
Db 1208 CPRGTQGVHCEINVDDCTPFYDSFTLEPKCFNNGKCIDRVGGYNCICPPGFVGERCE 1264 

Qy 833 VIIVGNLNS-LSRTSTALPADS 853 

I : : I II III 
Db 1265 GDVNECLSN PCDS 1277 



RESULT 2
NTC1_HUMAN
ID NTC1_HUMAN STANDARD; PRT; 2556 AA.
AC P46531;
DT 01-NOV-1995 (Rel. 32, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Neurogenic locus notch homolog protein 1 precursor (Notch 1) (hN1)
DE (Translocation-associated notch protein TAN-1).
GN NOTCH1 OR TAN1.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID-9 60 6; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

i RA Mann R.S., Blaurnueller CM., Zagouras P.; 

RT "Complete human notch 1 (hNl) cDNA sequence."; 

RL Submitted (SEP-2000) to the EMBL/ GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE OF 1-2444 FROM N.A. 

RX MEDLINE=91347 3 67; PubMed=l 8 3 1 6 92 ; 

RA Ellisen L.W., Bird J., West D.C., Soreng A.L., Reynolds T.C., 

RA Smith S.D., Sklar J.; 

RT "TAN-1, the human homolog of the Drosophila notch gene, is broken by 

RT chromosomal translocations in T lymphoblastic neoplasms. "; 

RL Cell 66:649-661(1991). 

RN [3] 

RP IDENTIFICATION OF LIGANDS. 

RX MEDLINE=9 918 07 65; PubMed=l 0 07 9256 ; 

RA Gray G.E., Mann R.S., Mitsiadis E . , Henrique D., Carcangiu M.-L., 

RA Banks A., Leiman J., Ward D., Ish-Horowitz D., Artavanis-Ts akonas S.; 

RT "Human ligands of the Notch receptor. "; 

RL Am. J. Pathol. 154:785-794(1999). 

RN [4] 

RP INTERACTION WITH DTX1 . 

RX MEDLINE=98250176; PubMed-95 9 02 94 ; 

RA Matsuno K. , Eastman D., Mitsiades T., Quinn A.M., Carcanciu M.L., 

RA Ordentlich P., Kadesch T., Artavanis-Tsakonas S.; 

RT "Human deltex is a conserved regulator of Notch signalling."; 

RL Nat. Genet. 19:74-78(1998). 

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands 
CC Jaggedl, Jagged2 and Deltal to regulate cell-fate determination. 

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP- J kappa and activates genes of the enhancer of split locus . 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs. May be important for normal lymphocyte 

CC function. In altered form, may contribute to transformation or 

CC progression in some T-cell neoplasms. Involved in the maturation 

CC of both CD4+ and CD8+ cells in the thymus (By similarity).

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N-
CC terminal fragment N(EC) which are probably linked by disulfide

CC bonds (By similarity). Interacts with DTX1 and DTX2.

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following
CC proteolytical processing NICD is translocated to the nucleus (By

CC similarity).

CC -!- TISSUE SPECIFICITY: In fetal tissues most abundant in spleen, 

CC brain stem and lung. Also present in most adult tissues where it 

CC is found mainly in lymphoid tissues. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C- 

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following 

CC ligand binding, it is cleaved by TNF-alpha converting enzyme

CC (TACE) to yield a membrane-associated intermediate fragment called 

CC notch extracellular truncation (NEXT). This fragment is then 

CC cleaved by presenilin dependent gamma-secretase to release a 



CC notch-derived peptide containing the intracellular domain (NICD) 

CC from the membrane (By similarity).

CC -!- PTM: Phosphorylated (By similarity). 

CC -!- DISEASE: NOTCH1 truncation is associated with T-cell acute

CC lymphoblastic leukemia. 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 36 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 5 ANK repeats. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF308602; AAG33848.1; -. 

DR EMBL; M73980; AAA60614.1; -. 

DR HSSP; P00740; 1EDM. 

DR Genew; HGNC:7881; NOTCH1.

DR MIM; 190198; -.

DR GO; GO:0016021; C:integral to membrane; NAS.

DR GO; GO:0003793; F:defense/immunity protein activity; NAS.

DR GO; GO:0006955; P:immune response; NAS.

DR InterPro; IPR002110; ANK.

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II.

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR002049; Laminin_EGF. 

DR InterPro; IPR008297; Notch. 

DR InterPro; IPR000800; Notch_dom. 

DR Pfam; PF00023; ank; 6. 

DR Pfam; PF00008; EGF; 35. 

DR Pfam; PF00066; notch; 3. 

DR PIRSF; PIRSF002279; Notch; 1. 

DR PRINTS; PR00010; EGFBLOOD.

DR PRINTS; PR00011; EGFLAMININ.

DR PRINTS; PR01452; NOTCH.

DR SMART; SM00248; ANK; 6.

DR SMART; SM00179; EGF_CA; 23.

DR SMART; SM00004; NL; 3.

DR PROSITE; PS50297; ANK_REP_REGION; 1.

DR PROSITE; PS50088; ANK_REPEAT; 4.

DR PROSITE; PS00010; ASX_HYDROXYL; 20.

DR PROSITE; PS00022; EGF_1; 34.

DR PROSITE; PS01186; EGF_2; 26.

DR PROSITE; PS50026; EGF_3; 36.

DR PROSITE; PS01187; EGF_CA; 18.

KW Receptor; Transcription regulation; Activator; Differentiation; 

KW Developmental protein; Repeat; ANK repeat; EGF-like domain; 

KW Transmembrane; Glycoprotein; Signal; Phosphorylation. 
Query Match 15.3%; Score 1034.5; DB 1; Length 2556;

Best Local Similarity 25.8%; Pred. No. 3.1e-54; 

Matches 316; Conservative 83; Mismatches 304; Indels 523; Gaps 73; 

Qy 94 CCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCSSACDGDH 143 

I I I II : I : : I : I : I : I I I I I : I I 
Db 89 CALGF--SGPLCLTPLDNACLTNPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQA 141

Qy 144 WGPHCTSRCQCKNGALCNPITGA — CHCAAGFRGWRCE DRCEQG TYGNDCHQ- 193 

i I I I I I I : I 1 I III : I I : I I I 

Db 142 — DPCASN-PCANGGQCLPFEASYICHCPPSFHGPTCRQDVNECGQKPRLCRHGGTCHNE 198 

Qy 194 RC QCQNGATC DHVTGECRCPPGYTGAFCE 222 

II I I I I I I I I I I I I I : I I I I 

Db 199 VGSYRCVCRATHTGPNCERPYVPCSPSPCQNGGTCRPTGDVTHECACLPGFTGQNCEENI 258 

Qy 223 DLCPPG--KHGPQC EQRCP CQNGGVCHHVTG- 251 

III I : I 1 II I I I I I I I : I 

Db 259 DDCPGNNCKNGGACVDGVNTYNCPCPPEWTGQYCTEDVDECQLMPNACQNGGTCHNTHGG 318 

Qy 252 -ECSCPSGWMGTVCGQ PCPEGRFGKNC — SQEC — 281 

I I : I I I I : I I I I I I : I 

Db 319 YNCVCWGWTGEDCSENIDDCASAACFHGATCHDRVAS FYCECPHGRTGLLCHLNDACI S 378 

Qy 282 -QCHNGGTCDA— ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY 333 

I : I II I : II I I I I I I I I : I I : I I I 

Db 379 NPCNEGSNCDTNPVNGKAICTCPSGYTGPACSQDYDECSLGAN PCEHAGKCI 430 

Qy 334 HVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRC PCHLENTHSCHPMSGE — CA 386 

: I : I I I : I I I I : i I I : I : I III 

Db 431 NTLGSFECQCLQGYTGPRCEIDV NECVSNPC — QNDATCLDQIGEFQCM 477 

Qy 387 CKPGWSGLYC NE TCSPGFYGEACQ QICS C 415 

I 1 I : I : : I II I I I I I I I : I 

Db 478 CMPGYEGVHCEVNTDECASSPCLHNGRCLDKINEFQCECPTGFTGHLCQYDVDECASTPC 537



Qy 416 QNGADC DSV-TGKCTCAPGFKG 436

: I I I I III I I I I : I

Db 538 KNGAKCLDGPNTYTCVCTEGYTGTHCEVDIDECDPDPCHYGSCKDGVATFTCLCRPGYTG 597



Qy 437 IDCST PCPL GTYGINCS SRCGCKNDAVCS 465 

II III I I I I I : I : 

Db 598 HHCETNINECSSQPCRLRGTCQDPDNAYLCFCLKGTTGPNCEINLDDCASSPCDSGTCLD 657 

Qy 466 PVDG-SCTCKAGWHGVDCSIR CPSGTWGFGCNL TC 499

: I I I I : I : I I : I I I I I I I 

Db 658 KIDGYECACEPGYTGSMCNSNIDECAGNPCHNGGTCEDGINGFTCRCPEGYHDPTCLSEV 717 

Qy 500 QCLNGGACNTLDG-TCTCAPGWRGEKCEL 527

I : : I : : I : I I I I I I I I : : 
Db 718 NECNSNPCVHGACRDSLNGYKCDCDPGWSGTNCDINNNECESNPCVNGGTCKDMTSGIVC 777 

Qy 528 PCQDGTYGLNCAERCD CSHADGC-HPTTGH-CRCLPGWSGVHCDSV CAEG- 575 

I : : I I I I : I : I I : I I I : : I I : I II 

Db 778 TCREGFSGPNCQTNINECASNPCLNKGTCIDDVAGYKCNCLLPYTGATCEVVLAPCAPSP 837

Qy 576 -RWGPNC SLPCYC KNGASCSPDDGICECAPGFRGTTCQRI CS 616

III III 1:1 I I : I I : I I I 

Db 838 CRNGGECRQSEDYESFSCVCPTAGAKGQTCEVDINECVLSPCRHGASCQNTHGXYRCHCQ 897 

Qy 617 PGFYGHRCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I : I I I III: I I I I I I I I I II 

Db 898 AGYSGRNCETDIDDC--RPNPCHNGGSCTDGINTAFCDCLPGFRGTFCEEDINECASDPC 955

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPG 693

I I : I I : I II I I I I I : I : I I M 

Db 956 RNGANCTDCVDSYTCTCPAGFSGIHCENNTPDCTESSCFNGGTC — VDGINSFTCLCPPG 1013 

Qy 694 WIGSDC SQP CPPAHWGPNC IHTCN CHNGA 722 

: I I I I : I II : I II I : I I : III 

Db 1014 FTGSYCQHWNECDSRPCLLGGTCQDGRGLHRCTCPQGYTGPNCQNLVHWCDSSPCKNGG 1073 

Qy 723 FC SAYDGECKCTPGWTGLYCTQ 744 

I : I hi I II II I I 

Db 1074 KCWQTHTQY--RCECPSGWTGLYCDVPSVSCEVAAQRQGVDVARLCQHGGLCVDAGNTHH 1131 

Qy 745 -RCPLGFYGKDCALI CQ CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

II I : I I : I II I I I I I : : I II I : I : I : : 

Db 1132 CRCQAGYTGSYCEDLVDECSPSPCQNGATCTDYLGGYSCKCVAGYHGVNCSEEIDECLSH 1191 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG 815 

I I I I I I I I I I I I I : I 

Db 1192 PCQNGGTCLDLPNTYKCSCPRGTQGVHCEINVDDCNPPVDPVSRSPKCFNNGTCVDQVGG 1251 

Qy 816 -TCYCSPGWKGARCDQAGVIIVGNLN 840

: I I I I : I I I : I : : I 

Db 1252 YSCTCPPGFVGERCE GDVN 1270 

RESULT 3 
NTC1_MOUSE

ID NTC1_MOUSE STANDARD; PRT; 2531 AA.

AC Q01705; Q06007; Q61905; Q99JC2; Q9QW58; Q9R0X7; 
DT 01-NOV-1995 (Rel. 32, Created) 
DT 01-FEB-1996 (Rel. 33, Last sequence update) 



DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Neurogenic locus notch homolog protein 1 precursor (Notch 1) (Motch A) 

DE (mT14) (p300).

GN NOTCH1 OR MOTCH.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM 1).

RC TISSUE=Embryo;

RX MEDLINE=93194170; PubMed=8449489;

RA Franco del Amo F., Gendron-Maguire M., Swiatek P.J., Jenkins N.A.,

RA Copeland N.G., Gridley T.;

RT "Cloning, analysis, and chromosomal localization of Notch-1, a mouse 

RT homolog of Drosophila Notch."; 

RL Genomics 15:259-264(1993). 

RN [2] 

RP SEQUENCE OF 731-1899 FROM N.A. (ISOFORM 2), AND DEVELOPMENTAL STAGE.

RC STRAIN=CD-1; TISSUE=Embryo;

RX MEDLINE=93050801; PubMed=1426644;

RA Reaume A.G., Conlon R.A., Zirngibl R., Yamaguchi T.P., Rossant J.;

RT "Expression analysis of a Notch homologue in the mouse embryo."; 

RL Dev. Biol. 154:377-387(1992). 

RN [3] 

RP SEQUENCE OF 1551-1647 FROM N.A. (ISOFORM 1), AND DEVELOPMENTAL STAGE. 

RC TISSUE=Embryo; 

RX MEDLINE=93048835; PubMed=1425352;

RA Franco del Amo F., Smith D.E., Swiatek P.J., Gendron-Maguire M.,

RA Greenspan R.J., McMahon A.P., Gridley T.;

RT "Expression pattern of Motch, a mouse homolog of Drosophila Notch,

RT suggests an important role in early postimplantation mouse

RT development."; 

RL Development 115:737-744(1992). 

RN [4] 

RP SEQUENCE OF 1161-1547 FROM N.A. 

RC STRAIN=C57BL/6 X CBA; TISSUE=Embryo;

RX MEDLINE=93178563; PubMed=8440332;

RA Lardelli M., Lendahl U.;

RT "Motch A and Motch B-two mouse Notch homologues coexpressed in a 

RT wide variety of tissues."; 

RL Exp. Cell Res. 204:364-372(1993). 

RN [5] 

RP SEQUENCE OF 1659-1673 FROM N.A. 

RX MEDLINE=99364499; PubMed=10437788;

RA Lee J.S., Ishimoto A., Yanagawa S.I.;

RT "Murine leukemia provirus-mediated activation of the Notch1 gene leads

RT to induction of HES-1 in a mouse T lymphoma cell line, DL-3.";

RL FEBS Lett. 455:276-280(1999).

RN [6] 

RP SEQUENCE OF 1950-2201 FROM N.A. 

RX MEDLINE=98029496; PubMed=9384671;

RA Messerle M., Folio M., Nehls M., Eggert H., Boehm T.;

RT "Dynamic changes in gene expression during in vitro differentiation of

RT mouse embryonic stem cells.";

RL Cytokines Cell. Mol. Ther. 1:139-143(1995).

RN [7] 



RP SEQUENCE OF 1655-1659, CLEAVAGE BY FURIN-LIKE CONVERTASE, AND 

RP MUTAGENESIS OF 1651-ARG--ARG-1654.

RX MEDLINE=98318619; PubMed=9653148;

RA Logeat F., Bessia C., Brou C., LeBail O., Jarriault S., Seidah N.G.,

RA Israel A.;

RT "The Notch1 receptor is cleaved constitutively by a furin-like

RT convertase.";

RL Proc. Natl. Acad. Sci. U.S.A. 95:8108-8112(1998). 

RN [8] 

RP PARTIAL SEQUENCE, AND POST-TRANSLATIONAL PROCESSING.

RX MEDLINE=21523956; PubMed=11518718;

RA Saxena M.T., Schroeter E.H., Mumm J.S., Kopan R.;

RT "Murine notch homologs (N1-4) undergo presenilin-dependent

RT proteolysis."; 

RL J. Biol. Chem. 276:40268-40273(2001). 

RN [9] 

RP POST-TRANSLATIONAL PROCESSING.

RX MEDLINE=21374376; PubMed=11459941;

RA Mizutani T., Taniguchi Y., Aoki T., Hashimoto N., Honjo T.;

RT "Conservation of the biochemical mechanisms of signal transduction 

RT among mammalian Notch family members."; 

RL Proc. Natl. Acad. Sci. U.S.A. 98:9026-9031(2001). 

RN [10] 

RP INTERACTION WITH DTX1 AND DTX2.

RX MEDLINE=21123790; PubMed=11226752;

RA Kishi N., Tang Z., Maeda Y., Hirai A., Mo R., Ito M., Suzuki S.,

RA Nakao K., Kinoshita T., Kadesch T., Hui C.-C., Artavanis-Tsakonas S.,

RA Okano H., Matsuno K.;

RT "Murine homologs of deltex define a novel gene family involved in 

RT vertebrate Notch signaling and neurogenesis."; 

RL Int. J. Dev. Neurosci. 19:21-35(2001). 

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity). May play an essential role in

CC postimplantation development, probably in some aspect of cell 

CC specification and/or differentiation. May be involved in mesoderm 

CC development, somite formation and neurogenesis. Involved in the 

CC maturation of both CD4+ and CD8+ cells in the thymus.

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N-

CC terminal fragment N(EC) which are probably linked by disulfide

CC bonds. Interacts with DTX1 and DTX2.

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 

CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=2; 

CC Name=1;

CC IsoId=Q01705-1; Sequence=Displayed;

CC Name=2;

CC IsoId=Q01705-2; Sequence=VSP_001402, VSP_001403, VSP_001404;

CC Note=No experimental confirmation available; 

CC -!- TISSUE SPECIFICITY: Highly expressed in the brain, lung and 
CC thymus. Expressed at lower levels in the spleen, bone-marrow, 

CC spinal cord, eyes, mammary gland, liver, intestine, skeletal 



CC muscle, kidney and heart. 

CC -!- DEVELOPMENTAL STAGE: First detected in the mesoderm at 7.5 dpc. By

CC 8.5 dpc highly expressed in presomitic mesoderm, mesenchyme and

CC endothelial cells, while much lower levels are seen in the 

CC neuroepithelium. Between 9.5-10.5 dpc expressed at high levels in 

CC the neuroepithelium. At 13.5 dpc expressed in the surface 

CC ectoderm, eye and developing whisker follicles. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C- 

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following 

CC ligand binding, it is cleaved by TNF-alpha converting enzyme 

CC (TACE) to yield a membrane-associated intermediate fragment called 

CC notch extracellular truncation (NEXT). This fragment is then

CC cleaved by presenilin dependent gamma-secretase to release a 

CC notch-derived peptide containing the intracellular domain (NICD) 

CC from the membrane. 

CC -!- PTM: Phosphorylated. 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 36 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 5 ANK repeats. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; Z11886; CAA77941.1; -. 

DR EMBL; L02613; AAK14898.1; -.

DR EMBL; X68278; CAA48339.1; -.

DR EMBL; AJ238029; CAB40733.1; -.

DR EMBL; X82562; CAA57909.1; -.

DR PIR; A46019; A46019. 

DR PIR; B49175; B49175. 

DR HSSP; P00740; 1EDM. 

DR MGD; MGI:97363; Notch1.

DR GO; GO:0005887; C:integral to plasma membrane; IC.

DR GO; GO:0005515; F:protein binding; IPI.

DR GO; GO:0030154; P:cell differentiation; IMP.

DR GO; GO:0007386; P:compartment specification; IMP.

DR GO; GO:0007219; P:N signaling pathway; IC.

DR GO; GO:0045944; P:positive regulation of transcription from P...; IDA.

DR InterPro; IPR002110; ANK.

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR002049; Laminin_EGF. 

DR InterPro; IPR008297; Notch. 

DR InterPro; IPR000800; Notch_dom. 

DR Pfam; PF00023; ank; 7. 



DR Pfam; PF00008; EGF; 35. 

DR Pfam; PF00066; notch; 3. 

DR PIRSF; PIRSF002279; Notch; 1. 

DR PRINTS; PR00010; EGFBLOOD.

DR PRINTS; PR00011; EGFLAMININ. 

DR PRINTS; PR01452; NOTCH. 

DR SMART; SM00248; ANK; 6. 

DR SMART; SM00179; EGF_CA; 24.

DR SMART; SM00004; NL; 2. 

DR PROSITE; PS50297; ANK_REP_REGION; 1. 

DR PROSITE; PS50088; ANK_REPEAT; 2. 

DR PROSITE; PS00010; ASX_HYDROXYL; 22.

DR PROSITE; PS00022; EGF_1; 34.

DR PROSITE; PS01186; EGF_2; 27.

DR PROSITE; PS50026; EGF_3; 36.

DR PROSITE; PS01187; EGF_CA; 21.

KW Receptor; Transcription regulation; Activator; Differentiation; 

KW Developmental protein; Repeat; ANK repeat; EGF-like domain; 

KW Transmembrane; Glycoprotein; Signal; Phosphorylation; 

KW Alternative splicing. 

FT SIGNAL 1 18 POTENTIAL. 

FT CHAIN 19 2531 NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1. 

FT CHAIN 1711 2531 NOTCH EXTRACELLULAR TRUNCATION. 

FT CHAIN 1744 2531 NOTCH INTRACELLULAR DOMAIN. 

FT DOMAIN 19 1725 EXTRACELLULAR (POTENTIAL).

Query Match 15.2%; Score 1028; DB 1; Length 2531;

Best Local Similarity 25.7%; Pred. No. 7.6e-54; 

Matches 314; Conservative 83; Mismatches 286; Indels 538; Gaps 73; 

Qy 86 TMYRRKSQCCPGFYESGEMCV PHCADKCVH-GRCI APNTCQCEPGWGGTNCS S- 137

I : I : I I I : I I : I I : : I : I : I : : I : I I I : I I 

Db 121 TLTEYKCRCSPGW--SGKSCQQADPCASNPCANGGQCLPFESSYICRCPPGFHGPTCRQD 178 

Qy 138 ACDGDHWGPHC TSRCQCKNGALCNP 162 

I I I I I I I 11:11111 

Db 179 VNECSQNPGLCRHGGHCHNEIGSYRCACCATHTGPHCELPYVPCSPSPCQNGATCRPTGD 238 

Qy 163 ITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATC-DHV-TGECRCPPGYTGAF 220 

I II III II: : II I : I I I I I I I I 1 I I I I : 

Db 239 TTHECACLPGFAGQNCEENVD DCPGN-NCKNGGACVDGVNTYNCRCPPEWTGQY 291

Qy 221 C-EDLCPPGKHGPQCEQRCP — CQNGGVCHHVTG— ECSCPSGWMGTVCGQ 266 

Ml: : I I I I I I I I I : I I I : I I I I : 

Db 292 CTEDV DEC-QLMPNACQNAGTCHNTHGGYNCVCVNGWTGEDCSENIDDCASAA 343 

Qy 267 PCPEGRFGKNC — SQEC QCHNGGTCDA — ATGQ — CHCSPG 301 

I I I I I I I hill I : I I I 

Db 344 CFQGATCHDRVASFYCECPHGRTGLLCHLKHACISNPCNEGSNCDTNPVNGKRICTCPSG 403 

Qy 302 YTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA— CLCEAGFAGERCEARLCP 356 

| | I I I I I : I : | : | | | : | : I | 1 : I M : 

Db 404 YTGPACSQDVDECDLGAN RCEHAGKCLNTLGSFECQCLQGYTGPGCEIDV-- 453 

Qy 357 EGLYGIKCDKRC PCHLENTHSCHPMSGE— CACKPGWSGLYCNET 399 

I | | : 1 : | | | | | | I : I : I I 

Db 454 NECISNPC — QNDATCLDQIGEFQCICMPGYEGVYCEINTDECASSPCLHN 502 



Qy 400 CSPGFYGEACQ QICS CQNGADC--DSVTGKCTCAPGFKGID 438

I I I I I I I : hill I | | | | : | 

Db 503 GHCMDKIHEFQCQCPKGFNGHLCQYDVDECASTPCKNGAKCLDGPNTYTCVCTEGYTGTH 562 

Qy 439 CST PCPLGT YGINCSSRCG CKNDAVCSPVD 468 

I III: I : I : I : : | | 

Db 563 CEVDIDECDPDPCHYGSCKDGVAT FTCLCQPGYTGHHCETNINECHSQPCRHGGTCQDRD 622 

Qy 469 GS--CTCKAGWHGVDCSIR CPSGTWGFGCNLTCQCLNGGACNTLDG-TCTCA 517

11111:11 I I I I II : : I I I I 

Db 623 NSYLCLCLKGTTGPNCEINLDDCASNPCDSGT CL DKIDGYECACE 667 

Qy 518 PGWRGEKCEL PCQDGTYGLNCAERCDCSHADGCHPTT 554 

I I : I I : I : I I I I II : I I I 

Db 668 PGYTGSMCNVNIDECAGSPCHNGGTCEDGIAGFTC — RC PEGYHDPTCLSEVNECN 721 

Qy 555 GHCR CLPGWSGVHCD SVCAE 574 

III I I I I I I : I I || 

Db 722 SNPCIHGACRDGLNGYKCDCAPGWSGTNCDINNNECESNPCVNGGTCKDMTSGYVCTCRE 781 

Qy 575 GRWGPNC SLPCY CKNG 590 

I I I I I III | M 

Db 782 GFSGPNCQTNINECASNPCLNQGTCIDDVAGYKCNCPLPYTGATCEVVLAPCATSPCKNS 841

Qy 591 ASCSPDDGI CECAPGFRGTTCQ RICSPGFYG 621 

I : | | I : : I I I : : I I : I 

Db 842 GVCKESEDYESFSCVCPTGWQGQTCEVDINECVKSPCRHGASCQNTNGSYRCLCQAGYTG 901 

Qy 622 HRCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I I Ml: I I I I I I II II I I 

Db 902 RNCESDIDDC--RPNPCHNGGSCTDGINTAFCDCLPGFQGAFCEEDINECASNPCQNGAN 959 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPGWIGSD 698 

Ml I : I II Mill : I : I I I I : I I 

Db 960 CTDCVDSYTCTCPVGFNGIHCENNTPDCTESSCFNGGTC--VDGINSFTCLCPPGFTGSY 1017 

Qy 699 C SQP CPPAHWGPNC IHTCN CHNGAFC 724 

I 1:1 I I : I I I : I : I I I I 

Db 1018 CQYDVNECDSRPCLHGGTCQDSYGTYKCTCPQGYTGLNCQNLVRWCDSAPCKNGGRCWQT 1077 

Qy 725 -SAYDGECKCTPGWTGLYC TQRCPLGFY--GKDCALICQ 760 

: I I : : 1 : I : II I : I I 

Db 1078 NTQY--HCECRSGWTGVNCDVLSVSCEVAAQKRGIDVTLLCQHGGLCVDEGDKPIYCHCQA 1135 

Qy 761 CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

I I I I I I I :: I II I : | : | : : 
Db 1136 GYTGSYCEDEVDECSPNPCQNGATCTDYLGGFSCKCVAGYHGSNCSEEINECLSQPCQNG 1195 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-TCYC 819 

M I : I II I 

Db 1196 GTCIDLTNS YKCSCPRGTQGVHCEINVDDCHPPLDPASRSPKCFNNGTCVDQVGGYTCTC 1255 

Qy 820 SPGWKGARCDQAGVIIVGNLN 840

I I : I I I : I : : I 

Db 1256 PPGFVGERCE GDVN 1269 



RESULT 4 
NTC1_RAT



ID NTC1_RAT STANDARD; PRT; 2531 AA.

AC Q07008; 

DT 01-NOV-1995 (Rel. 32, Created) 

DT 15-JUL-1999 (Rel. 38, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 1 precursor (Notch 1).

GN NOTCH1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Schwann cell;

RX MEDLINE=92111383; PubMed=1764995;

RA Weinmaster G., Roberts V.J., Lemke G.;

RT "A homolog of Drosophila Notch expressed during mammalian

RT development.";

RL Development 113:199-205(1991). 

RN [2] 

RP REVISIONS TO 1652-1653. 

RA Weinmaster G.;

RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP FUNCTION. 

RX MEDLINE=21094508; PubMed=11182080;

RA Tanigaki K., Nogaki F., Takahashi J., Tashiro K., Kurooka H.,

RA Honjo T.;

RT "Notch1 and Notch3 instructively restrict bFGF-responsive multipotent

RT neural progenitor cells to an astroglial fate.";

RL Neuron 29:45-55(2001).

RN [4] 

RP TISSUE SPECIFICITY. 

RX MEDLINE=93202015; PubMed=1295745;

RA Weinmaster G., Roberts V.J., Lemke G.;

RT "Notch2: a second mammalian Notch gene.";

RL Development 116:931-941(1992). 

RN [5] 

RP TISSUE SPECIFICITY. 

RX MEDLINE=21331789; PubMed=11438922;

RA Irvin D.K., Zurcher S.D., Nguyen T., Weinmaster G., Kornblum H.I.;

RT "Expression patterns of Notch1, Notch2, and Notch3 suggest multiple

RT functional roles for the Notch-DSL signaling system during brain

RT development.";

RL J. Comp. Neurol. 436:167-181(2001).

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands 
CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity). Acts instructively to control

CC the cell fate determination of CNS multipotent progenitor cells, 

CC resulting in astroglial induction and neuron/oligodendrocyte 



CC suppression. 

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N-
CC terminal fragment N(EC) which are probably linked by disulfide

CC bonds (By similarity).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following

CC proteolytical processing NICD is translocated to the nucleus (By

CC similarity).

CC -!- TISSUE SPECIFICITY: Expressed in the brain, kidney and spleen. 
CC Expressed in postnatal central nervous system (CNS) germinal zones 

CC and, in early postnatal life, within numerous cells throughout the 

CC CNS. Found in both subventricular and ventricular germinal zones. 

CC -!- DEVELOPMENTAL STAGE: In the embryo, highest levels occur between 
CC days 12 and 14 and decrease rapidly to much lower levels in the 

CC adult. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C-

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following

CC ligand binding, it is cleaved by TNF-alpha converting enzyme

CC (TACE) to yield a membrane-associated intermediate fragment called

CC notch extracellular truncation (NEXT). This fragment is then

CC cleaved by presenilin dependent gamma-secretase to release a

CC notch-derived peptide containing the intracellular domain (NICD)

CC from the membrane (By similarity).

CC -!- PTM: Phosphorylated (By similarity). 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 36 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 5 ANK repeats.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X57405; CAA40667.1; -. 

DR HSSP; P00740; 1EDM. 

DR InterPro; IPR002110; ANK. 

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR002049; Laminin_EGF.

DR InterPro; IPR008297; Notch. 

DR InterPro; IPR000800; Notch_dom. 

DR Pfam; PF00023; ank; 6. 

DR Pfam; PF00008; EGF; 35. 

DR Pfam; PF00066; notch; 3. 

DR PIRSF; PIRSF002279; Notch; 1. 

DR PRINTS; PR00010; EGFBLOOD. 

DR PRINTS; PR00011; EGFLAMININ. 

DR PRINTS; PR01452; NOTCH. 



DR SMART; SM00248; ANK; 6.

DR SMART; SM00179; EGF_CA; 25.

DR SMART; SM00004; NL; 2.

DR PROSITE; PS50297; ANK_REP_REGION; 1.

DR PROSITE; PS50088; ANK_REPEAT; 4.

DR PROSITE; PS00010; ASX_HYDROXYL; 22.

DR PROSITE; PS00022; EGF_1; 35.

DR PROSITE; PS01186; EGF_2; 26.

DR PROSITE; PS50026; EGF_3; 36.

DR PROSITE; PS01187; EGF_CA; 21.

KW Receptor; Transcription regulation; Activator; Differentiation;

KW Developmental protein; Repeat; ANK repeat; EGF-like domain;
Query Match 15.2%; Score 1024; DB 1; Length 2531;

Best Local Similarity 25.8%; Pred. No. 1.3e-53; 

Matches 315; Conservative 83; Mismatches 286; Indels 536; Gaps 74;

Qy 86 TMYRRKSQCCPGFYESGEMCV PHCADKCVH-GRCI APNTCQCEPGWGGTNCSS 137

I : I : I I I : I I : I I : : | : | : | : : I I I i : I I

Db 121 TLTEYKCRCPPGW--SGKSCQQADPCASNPCANGGQCLPFESSYICGCPPGFHGPTCRQD 178

Qy 138 ACDGDHWGPHC TSRCQCKNGALCNP 162

i I I I I I I I I : I I I I

Db 179 VNECSQNPGLCRHGGTCHNEIGSYRCACRATHTGPHCELPYVPCSPSPCQNGGTCRPTGD 238



Qy 163 ITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATC-DHV-TGECRCPPGYTGAF 220

I II III II: : i I 1:11 I I I I I I I I I : I I :

Db 239 TTHECACLPGFAGQNCEENVD DCPGN-NCKNGGACVDGVNTYNCRCPPEWTGQY 291

Qy 221 C-EDLCPPGKHGPQCEQRCP--CQNGGVCHHVTG--ECSCPSGWMGTVCGQ 266

III: : I I I I I I I I I : I I I : I I I I

Db 292 CTEDV DEC-QLMPNACQNAGTCHNSHGGYNCVCVNGWTGEDCSDNIDDCASAA 343



Qy 267 PCPEGRFGKNC--SQEC QCHNGGTCDA--ATGQ--CHCSPG 301

II 1 I I I : I I : I I I hill

Db 344 CFQGATCHDRVASFYCECPHGRTGLLCHLNDACISNPCNEGSNCDTNPVNGKAICTCPRG 403



Qy 302 YTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA— CLCEAGFAGERCEARLCP 356 

I I I I I I I : I I : I I I : I : I I I : I I I I : 

Db 404 YTGPACSQDVDECALGAN PCEHAGKCLNTLGSFECQCLQGYTGPRCEIDV — 453 

Qy 357 EGLYGIKCDKRC PCHLENTHSCHPMSGE — CACKPGWSGLYC 396 

i I I : I : I I I I I I i : I : I I 

Db 454 NECISNPC--QNDATCLDQIGEFQCICMPGYEGVYCEINTDECASSPCLHN 502 

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC--DSVTGKCTCAPGFKGID 438 

II I I I I I I I : I : I I I I I I I I : I 

Db 503 GRCVDKINEFLCQCPKGFSGHLCQYDVDECASTPCKNGAKCLDGPNTYTCVCTEGYTGTH 562

Qy 439 CST PCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGW 477 

I I I : I MM: : I I : I : 

Db 563 CEVDIDECDPDPCHIGL CK-DGVAT FTCLCQPGYTGHHCETNINECH 608 

Qy 478 HGVDCSIR CPSGTWGFGCNLTCQ CLNGGACNTLDG-TCTCAP 518

I I I I I I I I I : I : I : : I I I I I 

Db 609 SQPCRHGGTCQDRDNYYLCLCLKGTTGPNCEINLDDCASNPCDSGTCLDKIDGYECACEP 668 

Qy 519 GWRGEKCEL PCQDGTYGLNCAERCDCSHADGCHPTT 554 

I : I I : I : I I I I I I Mil 

Db 669 GYTGSMCNVNI DECAGSPCHNGGTCEDGIAGFTC — RC PEGYHDPTCLSEVNECNS 722 

Qy 555 GHCR CLPGWSGVHCD SVCAEG 575 

III I I I I I I : I 1 III 

Db 723 NPCIHGACRDGLNGYKCDCAPGWSGTNCDINNNECESNPCVNGGTCKDMTSGYVCTCREG 782

Qy 576 RWGPNC SLPCY CKNGA 591

I I I I III I I I 

Db 783 FSGPNCQTNINECASNPCLNQGTCIDDVAGYKCNCPLPYTGATCEVVLAPCATSPCKNSG 842

Qy 592 SCSPDDGI CECAPGFRGTTCQ RICSPGFYGH 622 

I : I I I : : I I I : : I I : I 

Db 843 VCKESEDYESFSCVCPTGWQGQTCEIDINECVKSPCRHGASCQNTNGSYRCLCQAGYTGR 902 

Qy 623 RCSQTCPQCVHSSGPCHH ITGLCDCLPGFTGALCNE 658 

I I III: I I I I I I I II I I 

Db 903 NCESDIDDC--RPNPCHNGGSCTDGVNAAFCDCLPGFQGAFCEEDINECATNPCQNGANC 960 

Qy 659 VCPSGRFGKNCAG ICT CTNNGTCNPIDR SCQCYPGWIGSDC 699

I I : I I : I II I I M I : I : I I M : M I 

Db 961 TDCVDSYTCTCPTGFNGIHCENNTPDCTESSCFNGGTC — VDGINS FTCLCPPGFTGSYC 1018 

Qy 700 SQP CPPAHWGPNC IHTCN CHNGAFC 724 

I : I I I : I I I : I : III! 

Db 1019 QYDVNECDSRPCLHGGTCQDSYGTYKCTCPQGYTGLNCQNLVRWCDSAPCKNGGKCWQTN 1078

Qy 725 SAYDGECKCTPGWTGLYC TQRCPLGFY--GKDCALICQ 760

: ! IMIIIII : I : I 1 1 = 11 

Db 1079 TQY— HCECRSGWTGFNCDVLSVSCEVAAQKRGIDVTLLCQHGGLCVDEEDKHYCHCQAG 1136 

Qy 761 CQNGADC-DHISG-QCTCRTGFMGRHCEQK 788 

1 I I I I I I : : I I I I : I : I : : 

Db 1137 YTGSYCEDEVDECSPNPCQNGATCTDYLGGFSCKCVAGYHGSNCSEEINECLSQPCQNGG 1196 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-TCYCS 820 



I I I I I ! I 111111:1111 

Db 1197 TCIDLTNTYKCSCPRGTQGVHCEINVDDCHPPLDPASRSPKCFNNGTCVDQVGGYTCTCP 1256 

Qy 821 PGWKGARCDQAGVIIVGNLN 840

I I : I I I : i : : I 

Db 1257 PGFVGERCE GDVN 1269 



RESULT 5 
NTC2_HUMAN 

ID NTC2_HUMAN STANDARD; PRT; 2471 AA.

AC Q04721; Q99734; Q9H240; 

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Neurogenic locus notch homolog protein 2 precursor (Notch 2) (hN2).

GN NOTCH2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A.

RC TISSUE=Brain;

RA Blaumueller C.M., Mann R.S.;

RT "Complete human notch 2 (hN2) cDNA sequence.";

RL Submitted (NOV-2000) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A.

RC TISSUE=Breast tumor;

RA Correa R.G., Camargo A.A., Moreira E.S., Simpson A.J.G.;

RT "Human Notch2, a novel member of cell-fate determining NOTCH

RT family.";

RL Submitted (OCT-2000) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE OF 967-1229 FROM N.A.

RC TISSUE=T-cell;

RA Lemasson I., Devaux C., Mesnard J.M.;

RT "Partial sequence of EGF-like repeat domain of human Notch2 mRNA.";

RL Submitted (NOV-1996) to the EMBL/GenBank/DDBJ databases.

RN [4] 

RP SEQUENCE OF 1810-2447 FROM N.A. 

RC TISSUE=Brain;

RX MEDLINE=93265135; PubMed=1303260;

RA Stifani S., Blaumueller C.M., Redhead N.J., Hill R.E.,

RA Artavanis-Tsakonas S.; 

RT "Human homologs of a Drosophila enhancer of split gene product define 

RT a novel family of nuclear proteins."; 

RL Nat. Genet. 2:119-127(1992). 

RN [5] 

RP POST-TRANSLATIONAL PROCESSING.

RX MEDLINE=97386453; PubMed=9244302;

RA Blaumueller C.M., Qi H., Zagouras P., Artavanis-Tsakonas S.;

RT "Intracellular cleavage of Notch leads to a heterodimeric receptor on 

RT the plasma membrane."; 

RL Cell 90:281-291(1997). 

RN [6] 



RP IDENTIFICATION OF LIGANDS.

RX MEDLINE=99180765; PubMed=10079256;

RA Gray G.E., Mann R.S., Mitsiadis E., Henrique D., Carcangiu M.-L.,

RA Banks A., Leiman J., Ward D., Ish-Horowitz D., Artavanis-Tsakonas S.;

RT "Human ligands of the Notch receptor.";

RL Am. J. Pathol. 154:785-794(1999).

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity).

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N-
CC terminal fragment N(EC) which are probably linked by disulfide

CC bonds (By similarity).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 
CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- TISSUE SPECIFICITY: Expressed in the brain, heart, kidney, lung, 
CC skeletal muscle and liver. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C-

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following

CC ligand binding, it is cleaved by TNF-alpha converting enzyme

CC (TACE) to yield a membrane-associated intermediate fragment called

CC notch extracellular truncation (NEXT). This fragment is then

CC cleaved by presenilin dependent gamma-secretase to release a

CC notch-derived peptide containing the intracellular domain (NICD)

CC from the membrane (By similarity).

CC -!- PTM: Phosphorylated (By similarity). 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 35 EGF-like domains. 

CC -!- SIMILARITY: Contains 2 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 6 ANK repeats. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF308601; AAA36377.2; -. 

DR EMBL; AF315356; AAG37073.1; -. 

DR EMBL; U77493; AAB19224.1; 

DR HSSP; P00740; 1EDM. 

DR Genew; HGNC:7882; NOTCH2.

DR MIM; 600275; -. 

DR InterPro; IPR002110; ANK. 

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II.

DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR002049; Laminin_EGF.
DR InterPro; IPR008297; Notch.
DR InterPro; IPR000800; Notch_dom.
DR Pfam; PF00023; ank; 6.
DR Pfam; PF00008; EGF; 35.
DR Pfam; PF00066; notch; 2.
DR PIRSF; PIRSF002279; Notch; 1.
DR PRINTS; PR00010; EGFBLOOD.
DR PRINTS; PR00011; EGFLAMININ.
DR PRINTS; PR01452; NOTCH.
DR SMART; SM00248; ANK; 6.
DR SMART; SM00179; EGF_CA; 23.
DR SMART; SM00004; NL; 2.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)




DE Neurogenic locus notch homolog protein 2 precursor (Notch 2).
GN NOTCH2.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;




RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Brain;
RX MEDLINE=93202015; PubMed=1295745;
RA Weinmaster G., Roberts V.J., Lemke G.;
RT "Notch2: a second mammalian Notch gene.";
RL Development 116:931-941(1992).
RN [2]
RP TISSUE SPECIFICITY.





RX MEDLINE=21331789; PubMed=11438922;

RA Irvin D.K., Zurcher S.D., Nguyen T., Weinmaster G., Kornblum H.I.;

RT "Expression patterns of Notch1, Notch2, and Notch3 suggest multiple

RT functional roles for the Notch-DSL signaling system during brain

RT development.";

RL J. Comp. Neurol. 436:167-181(2001).

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and

CC apoptotic programs. May play an essential role in postimplantation

CC development, probably in some aspect of cell specification and/or

CC differentiation (By similarity).

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N-

CC terminal fragment N(EC) which are probably linked by disulfide

CC bonds (By similarity).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following

CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- TISSUE SPECIFICITY: Highly expressed in the spleen and choroid 

CC plexus in the brain. Expressed in postnatal central nervous system 

CC (CNS) germinal zones and, in early postnatal life, within numerous 

CC cells throughout the CNS. It is more highly localized to 

CC ventricular germinal zones. Also found in the heart, liver and 

CC kidney. 

CC -!- DEVELOPMENTAL STAGE: Expressed in the brain during E14 and E17.

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form

CC which is proteolytically cleaved by a furin-like convertase in the

CC trans-Golgi network before it reaches the plasma membrane to yield

CC an active, ligand-accessible form. Cleavage results in a C-

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following

CC ligand binding, it is cleaved by TNF-alpha converting enzyme

CC (TACE) to yield a membrane-associated intermediate fragment called

CC notch extracellular truncation (NEXT). This fragment is then

CC cleaved by presenilin dependent gamma-secretase to release a

CC notch-derived peptide containing the intracellular domain (NICD)

CC from the membrane (By similarity).

CC -!- PTM: Phosphorylated (By similarity). 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 35 EGF-like domains. 

CC -!- SIMILARITY: Contains 2 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 6 ANK repeats. 
DR InterPro; IPR000742; EGF_2.
DR InterPro; IPR001881; EGF_Ca.
DR InterPro; IPR001438; EGF_II.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR002049; Laminin_EGF.
DR InterPro; IPR008297; Notch.
DR InterPro; IPR000800; Notch_dom.
DR Pfam; PF00023; ank; 6.
DR Pfam; PF00008; EGF; 35.
DR Pfam; PF00066; notch; 2.
DR PIRSF; PIRSF002279; Notch; 1.
DR PRINTS; PR00010; EGFBLOOD.
DR PRINTS; PR00011; EGFLAMININ.
DR PRINTS; PR01452; NOTCH.
DR SMART; SM00248; ANK; 6.
DR SMART; SM00179; EGF_CA; 24.
FT DISULFID 148 159 BY SIMILARITY.
Query Match 14.8%; Score 998; DB 1; Length 2471;

Best Local Similarity 24.4%; Pred. No. 4.6e-52; 

Matches 321; Conservative 79; Mismatches 322; Indels 596; Gaps 70;

Qy 17 CHWIGTASPLNLEDPNVCSHWESYSVTVQESY PHPFDQI YYTSCTDIL 64 

II: | : : | | | : | | : : : | : : 

Db 121 CHMLS WDTYECTCQVGFTGKQCQWTDVCLSHPCEN — GSTCSSVA 163 

Qy 65 NWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKCVHGRCIAPNTC 12 4 

III : I I : I : I : : I I I I I 

Db 164 NQFSC RCPAGI --TGQKCDADINECDIPGRCQHGGTC 198 

Qy 125 QCEPGWGGTNCSSACDGDHWGPHCTS RCQCKNGALC NPITGACHCAAGFRG 175 

II I I I I I I Mil I I I I II I 

Db 199 LNLP — GS YRCQ--CPQRFTGQHCDSPYVPCAPSPCVNGGTCRQTGDFTSECHCLPGFEG 254 

Qy 17 6 WRCEDRCEQGTYGNDCHQRCQCQNGATC-DHV-TGECRCPPGYTGAFC-EDLCPPGKHGP 2 32 

II : II : I I I I I I I I I i I I I : I I I I II : 
Db 255 SNCERNID DCPNH-KCQNGGVCVDGVNTYNCRCPPQWTGQFCTEDV D 300 

Qy 233 QC-EQRCPCQNGGVCHHVTG — ECSCPSGWMGTVCGQP 2 67 

: I I I I I I I I : I I I : I I I I : 
Db 301 ECLLQPNACQNGGTCTNRNGGYGCVCVNGWSGDDCSENIDDCAFASCTPGSTCIDRVASF 3 60 

Qy 268 CPEGRFGKNC — SQEC QCHNGGT CDA- -AT GQ- - CHCS PGYTGERCQ DECP 312 

1111:11 I Mill I I I I I I I III 

Db 361 SCLCPEGKAGLLCHLDDACISNPCHKGALCDTNPLNGQYICTCPQAYKGADCTEDVDECA 42 0 

Qy 313 VGT YGVLCAETCQCVNGGKCYHVS GA — CLCEAGFAGERCE 351 

: I : I : I I I : II II I : I I I II 

Db 421 M ANSNPCEHAGKCVNTDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLD 473 

Qy 352 ARLCPEGLYGIKCDKR CP CHLE- 373 

I I I I : I : II I : : 

Db 474 KIGGFTCLCMPGFKGVHCELEVNECQSNPCVNNGQCVDKVNRFQCLCPPGFTGPVCQIDI 533 

Qy 374 NTHSC— HPMSGECACKPGWSGLYCNET 399 

I I II III I : : I I : I 
Db 534 DDCSSTPCLNGAKCIDHPNGYECQCATGFTGTLCDENIDNCDPDPCHHGQCQDGIDSYTC 593 

Qy 400 -CSPGFYGEAC-QQICSC QNGADCDSVTG-KCTCAPGFKGIDC STP 442 

I : I 1 : I I I I I : I I I I : I I I I I : : I II 

Db 594 ICNPGYMGAICSDQIDECYSSPCLNDGRCIDLVNGYQCNCQPGTSGLNCEINFDDCASNP 653 

Qy 443 CPLGTY--GIN CS SRCG CKNDAVC 464 

II III II II I : I I I 

Db 65 4 CLHGACVDGINRYSCVCSPGFTGQRCNIDIDECASNPCRKDATCINDVNGFRCMCPEGPH 713 



Qy 465 SP-VDGSCT CKAGWHGVDCSI 484 

I 1 : I : I I I I I I I : : I : 

Db 714 HPSCYSQVNECLSSPCIHGNCTGGLSGYKCLCDAGWVGINCEVDKNECLSNPCQNGGTCN 773 

Qy 485 RCPSGTWGFGCNLTCQ CLNGGAC 507 

I I I : I : Mill 
Db 774 NLVNGYRCTCKKGFKGYNCQVNIDECASNPCLNQGTCLDDVSGYTCHCMLPYTGKNCQTV 833 

Qy 508 NTLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHAD 54 8 

I I I I I I I I : I : : I : : I : I : 

Db 834 LAP C S P N P C ENAAVCKEAPN FE S FT C L CAP GWQ GQ RCT VD VD E CVSK-PCMNNG 8 86 

Qy 549 GCHPTTGH — CRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASC — SPDDGICECAP 604 

I I I I I I I I : I I : i : II I : I I I I : III 

Db 887 ICHNTQGSYMCECPPGFSGMDCEE DINDCLANPCQNGGSCVDKVNTFSCLCLP 939 

Qy 605 GFRGTTCQR 1 CSPGFYGHRCSQTCPQCVHSS 635 

Ml I I I I I : I I : I II 

Db 940 GFVGDKCQTDMNECLSEPCKNGGTCSDYVNSYTCTCPAGFHGVHCENNI DECTESSCFNG 999 

Qy 636 GPCHHITGL CDCLPGFTGALC NE VCPSGRFG 666 

I I : I : I I I I I I I II II I I 

Db 1000 GTC— VDGINSFSCLCPVGFTGPFCLHDINECSSNPCLNSGTCVDGLGTYRCTCPLGYTG 1057 

Qy 667 KNC AGICT CTNNGTC--NPIDRSCQCYPGWIGSDCSQ 701 

I I I : I : I I I I I I I I I I I : I 

Db 1058 KNCQTLVNLCS P S PCKNKGTCAQEKARPRCLCP PGWDGAYCDVLNVSCKAAALQKGVPVE 1117 

Qy 702 PCPPAHWGPMC IHTC NCHNGAFCSAYDG — ECKCTP 735 

II : I I : I | : | | | I : I N I 

Db 1118 HLCQHSGICINAGNTHHCQCPLGYTGSYCEEQLDECASNPCQHGATCSDFIGGYRCECVP 1177 

Qy 736 GWTGLYCTQR CPLGFYG KDCALICQCQN 7 63 

I : I : I I I I I I I I I I 

Db 1178 GYQGVNCEYEVDECQNQPCQNGGTCIDLVNHFKCSCPPGTRGLLCEENIDDCAGAPHCLN 1237 

Qy 764 GADC-DHI SG QCTCRTGFM 781 

I I I I : I 

Db 1238 GGQCVDRIGGYSCRCLPGFAGERCEGDINECLSNPCSSEGSLDCIQLKNNYQCVCRSAFT 1297 

Qy 7 82 GRHCE QKCPSGTYGYGCRQICDCLNNSTCDHITGT CYCSPGWKGARCDQA 831 

I I I I I II I 1 I I I : I I I I : I I I I : 

Db 1298 GRHCETFLDVCPQK PCLNGGTCAVASNVPDGFICRCPPGFSGARCQSS 1345 



RESULT 7 
NTC2_MOUSE 

ID NTC2_MOUSE STANDARD; PRT; 2470 AA.

AC O35516; Q06008; Q60941;

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 2 precursor (Notch 2) (Motch

DE B).

GN NOTCH2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6; TISSUE=Thymus;

RA Hamada Y., Higuchi M., Tsujimoto Y.;

RT "Complete amino acid sequence and multiform transcripts encoded by a

RT single copy of mouse Notch2 gene.";

RL Submitted (JUL-1994) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE OF 316-1518 FROM N.A. 

RC STRAIN=C57BL/6 X CBA; TISSUE=Embryo;

RX MEDLINE=93178563; PubMed=8440332;

RA Lardelli M., Lendahl U.;

RT "Motch A and Motch B-two mouse Notch homologues coexpressed in a

RT wide variety of tissues.";

RL Exp. Cell Res. 204:364-372(1993). 

RN [3] 

RP SEQUENCE OF 1765-2153 FROM N.A. 

RX MEDLINE=97075110; PubMed=8917536;

RA Milner L.A., Bigas A., Kopan R., Brashem-Stein C., Bernstein I.D.,

RA Martin D.I.;

RT "Inhibition of granulocytic differentiation by mNotch1.";

RL Proc. Natl. Acad. Sci. U.S.A. 93:13014-13019(1996).

RN [4] 

RP FUNCTION. 

RX MEDLINE=99396706; PubMed=10393120;

RA Hamada Y., Kadokawa Y., Okabe M., Ikawa M., Coleman J.R.,

RA Tsujimoto Y.;

RT "Mutation in ankyrin repeats of the mouse Notch2 gene induces early

RT embryonic lethality.";

RL Development 126:3415-3424(1999).

RN [5] 

RP DEVELOPMENTAL STAGE, AND ALTERNATIVE SPLICING. 

RX MEDLINE=95333893; PubMed=7609614;

RA Higuchi M., Kiyama H., Hayakawa T., Hamada Y., Tsujimoto Y.;

RT "Differential expression of Notch1 and Notch2 in developing and adult

RT mouse brain.";

RL Brain Res. Mol. Brain Res. 29:263-272(1995).

RN [6]

RP POST-TRANSLATIONAL PROCESSING, AND MUTAGENESIS OF MET-1699.

RX MEDLINE=21523956; PubMed=11518718;

RA Saxena M.T., Schroeter E.H., Mumm J.S., Kopan R.;

RT "Murine notch homologs (N1-4) undergo presenilin-dependent

RT proteolysis."; 

RL J. Biol. Chem. 276:40268-40273(2001). 

RN [7] 

RP POST-TRANSLATIONAL PROCESSING, AND MUTAGENESIS OF MET-1699.

RX MEDLINE=21374376; PubMed=11459941;

RA Mizutani T., Taniguchi Y., Aoki T., Hashimoto N., Honjo T.;

RT "Conservation of the biochemical mechanisms of signal transduction 

RT among mammalian Notch family members."; 

RL Proc. Natl. Acad. Sci. U.S.A. 98:9026-9031(2001). 

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands 

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 



CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity). May play an essential role in

CC postimplantation development, probably in some aspect of cell

CC specification and/or differentiation.

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N-

CC terminal fragment N(EC) which are probably linked by disulfide

CC bonds.

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 

CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=2; 

CC Name=1;

CC IsoId=O35516-1; Sequence=Displayed;

CC Name=2;

CC IsoId=O35516-2; Sequence=VSP_001405;

CC Note=No experimental confirmation available; 

CC -!- TISSUE SPECIFICITY: Expressed in the brain, liver, kidney, 

CC neuroepithelia, somites, optic vesicles and branchial arches, but 

CC not heart. 

CC -!- DEVELOPMENTAL STAGE: Expressed in the embryonic ventricular zone, 
CC the postnatal ependymal cells, and the choroid plexus throughout 

CC embryonic and postnatal development. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C-

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following

CC ligand binding, it is cleaved by TNF-alpha converting enzyme 

CC (TACE) to yield a membrane-associated intermediate fragment called 

CC notch extracellular truncation (NEXT). This fragment is then

CC cleaved by presenilin dependent gamma-secretase to release a 

CC notch-derived peptide containing the intracellular domain (NICD) 

CC from the membrane. 

CC -!- PTM: Phosphorylated.

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 35 EGF-like domains. 

CC -!- SIMILARITY: Contains 2 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 6 ANK repeats. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).



DR EMBL; D32210; BAA22094.1;
DR EMBL; X68279; CAA48340.1; -.
DR EMBL; U31881; AAC52924.1;
DR PIR; A49175; A49175.
DR HSSP; P16109; 1FSB.
DR MGD; MGI:97364; Notch2.
DR GO; GO:0005887; C:integral to plasma membrane; IC.
DR GO; GO:0005515; F:protein binding; IPI.
DR GO; GO:0002011; P:morphogenesis of an epithelial sheet; IMP.
DR GO; GO:0007219; P:N signaling pathway; IC.

DR InterPro; IPR002110; ANK. 

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR002049; Laminin_EGF.

DR InterPro; IPR008297; Notch.

DR InterPro; IPR000800; Notch_dom.

DR Pfam; PF00023; ank; 6.

DR Pfam; PF00008; EGF; 34.

DR Pfam; PF00066; notch; 2.

DR PIRSF; PIRSF002279; Notch; 1.

DR PRINTS; PR00010; EGFBLOOD.

DR PRINTS; PR00011; EGFLAMININ.

DR PRINTS; PR01452; NOTCH.

DR SMART; SM00248; ANK; 6.

DR SMART; SM00179; EGF_CA; 23.

DR SMART; SM00004; NL; 3.

DR PROSITE; PS50297; ANK_REP_REGION; 1.

DR PROSITE; PS50088; ANK_REPEAT; 4.

DR PROSITE; PS00010; ASX_HYDROXYL; 22.

DR PROSITE; PS00022; EGF_1; 33.

DR PROSITE; PS01186; EGF_2; 27.

DR PROSITE; PS50026; EGF_3; 35.

DR PROSITE; PS01187; EGF_CA; 22. 

KW Receptor; Transcription regulation; Activator; Differentiation; 

KW Developmental protein; Repeat; ANK repeat; EGF-like domain; 

KW Transmembrane; Glycoprotein; Signal; Phosphorylation; 
Query Match 14.7%; Score 993; DB 1; Length 2470;

Best Local Similarity 25.8%; Pred. No. 9.1e-52; 

Matches 316; Conservative 87; Mismatches 295; Indels 526; Gaps 71;

Qy 93 QCCPGFYESGEMCVPHCADKCVHGR-CIAPNTCQ CEPGWGGTNCSSACDG 141

: I I I I : I I I : I II II I : I : I I

Db 91 RCAPGF--TGEDCQYSTSHPCFVSRPCQNGGTCHMLSRDTYECTCQVGFTGKQC 142

Qy 142 DHWGPHCTSRCQCKNGALCNPITG--ACHCAAGFRGWRCE DRCEQ 184

I II | : | | : | : : I I I I I : I I I I :

Db 143 -QWTDACLSH-PCENGSTCTSVASQFSCKCPAGLTGQKCEADINECDIPGRCQHGGTCLN 200

Qy 185 --GTYGNDC HQRCQ CQNGATCDHV TGECRCPPGYTGAFCE D 223

I : I I II I I I I I I I I I I 1 : I : I I I

Db 201 LPGSYRCQCGQGFTGQHCDSPYVRGLPCVNGGTCRQTGDFTLECNCLPGFEGSTCERNID 260

Qy 224 LCPPGK--HGPQCEQ RCP CQNGGVCHHVTG-- 251

I I I : I I III I I I I I I : I

Db 261 DCPNHKCQNGGVCVDGVNTYNCRCPPQWTGQFCTEDVDECLLQPNACQNGGTCTNRNGGY 320

Qy 252 ECSCPSGWMGTVCGQP CPEGRFGKNC--SQEC 281

I I : I I I I : Nihil I

Db 321 GCVCVNGWSGDDCSENIDDCAYASCTPGSTCIDRVASFSCLCPEGKAGLLCHLDDACISN 380

Qy 282 QCHNGGTCDA--ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCYH 334

I I I I I 1 I I I I I I I III: | : | : | | | :

Db 381 PCHKGALCDTNPLNGQYICTCPQGYKGADCTEDVDECAM ANSNPCEHAGKCVN 433

Qy 335 VSGA--CLCEAGFAGERCE ARLCPEGLYGIKCDKR- 367

11111:11111 t t I |=|-

Db 434 TDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLDKIGGFTCLCMPGFKGVHCELEV 493

Qy 368 CP CHLE NTHSC--HPMSGE 384

II I : : 1 I I I I

Db 494 NECQSNPCVNNGQCVDKVNRFQCLCPPGFTGPVCQIDIDDCSSTPCLNGAKCIDHPNGYE 553
553 



Qy 385 CACKPGWSGLYCNET CSPGFYGEAC-QQI 412 

I I I : : I : I : I I : I I : I I I I 

Db 554 CQCATGFTGILCDENIDNCDPDPCHHGQCQDGIDSYTCICNPGYMGAICSDQIDECYSSP 613 

Qy 413 CSCQNGA --DCDS VTG KCTCAPGFK 435 

I : I I I III II I I : I ! i 

Db 614 CLNDGRCIDLVNGYQCNCQPGTSGLNCEINFDDCASNPCMHGVCVDGINRYSCVCSPGFT 673 

Qy 436 G1DC STPCPLGTYGINCSS — RCGCK NDAVCSP-VDGSCT 472 

II I I I I I I : I I I | : : : | : 1 : | | 

Db 67 4 GQRCNIDIDECASNPCRKGATCINDVNGFRCICPEGPHHPSCYSQVNECLSNPCIHGNCT 733 

Qy 473 CKAGWHGVDCSI RCPS GTWGFGCNLT 498 

I I I I I I : I : I I I : I : 

Db 7 34 GGLSGYKCLCDAGWVGVNCEVDKNECLSNPCQNGGTCNNLVNGYRCTCKKGFKGYNCQVN 7 93 

Qy 499 CQ CLNGGAC NTL 510 

I I I I I ! 
Db 794 IDECASNPCLNQGTCFDDVSGYTCHCMLPYTGKNCQTVLAPCSPNPCENAAVCKEAPNFE 853 

Qy 511 DGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGH--CRCLPGWSGVHC 568 

: I : I : : I : : I : I : I I I I I I I I : I I : I 

Db 854 SFSCLCAPGWQGKRCTVDVDE CIS K- PCMNNGVCHNTQGS YVCECP P GFS GMDC 906 

Qy 569 DSVCAEGRWGPNCSLPCYCKNGASCSPDDGI CECAPGFRGTTCQR 613 

: II hllll I : hi III I II 

Db 907 EE DINDCLANPCQNGGSCV — DHVNTFSCQCHPGFIGDKCQTDMNECLSEPCK 957 

Qy 614 ICSPGFYGHRCSQTCPQCVHSS GPCHHITGL CDCLPGF 651 

I I I : I I : I II | | : I : I I I I 

Db 958 NGGTCSDYVNSYTCTCPAGFHGVHCENNIDECTESSCFNGGTC— VDGINSFSCLCPVGF 1015 

Qy 652 TGALC NE VCPSGRFGKNC AGICT CTNNGT 68 0 

Ml II : I I I 1 I I I : I : I I I I 

Db 1016 TGPFCLHDINECSSNPCLNAGTCVDGLGTYRCICPLGYTGKNCQTLVNLCSRSPCKNKGT 1075 

Qy 681 C--NPIDRSCQCYPGWIGSDC SQPCPPA — HWGPNCIHTCNCHNGAFCSAYD-GECK 732 

I I I I I I I : I : I I I I I I : I : I : I : 

Db 1076 CVQEKARPHCLCPPGWDGAYCDVLNVSCKAAALQKGVPVEHLCQ-HSGICINAGNTHHCQ 1134 

Qy 733 CTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADC-DHISG-QCTCRTGFMGRHCEQK — 78 8 

I hllll:: : I I I I : I I I I I I : I I I : I : I I : 

Db 1135 CPLGYTGSYCEEQL DECA-SNPCQHGATCNDFIGGYRCECVPGYQGVNCEYEVD 1187 

Qy 789 CPSGTYGYGCRQICD CLNNSTC-DHITG-T 816 

I I I I I I : I I I I I I I I I 

Db 1188 ECQNQPCQNGGTCIDLVNHFKCSCPPGTRGLLCEENIDECAGGPHCLNGGQCVDRIGGYT 1247 

Qy 817 CYCSPGWKGARCDQAGVIIVGNLN 84 0 

I I II: I lh I : : I 

Db 1248 CRCLPGFAGERCE GDIN 1264 



RESULT 8 
NTC1_BRARE 

ID NTC1_BRARE STANDARD; PRT; 2437 AA. 

AC P46530; 



DT 01-NOV-1995 (Rel. 32, Created)

DT 01-NOV-1995 (Rel. 32, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 1 precursor. 

GN NOTCH1A OR NOTCH. 

OS Brachydanio rerio (Zebrafish) (Danio rerio) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Cypriniformes;

OC Cyprinidae; Danio.

OX NCBI_TaxID=7955;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Embryo;

RX MEDLINE=94128602; PubMed=8297791;

RA Bierkamp C., Campos-Ortega J.A.;

RT "A zebrafish homologue of the Drosophila neurogenic gene Notch and

RT its pattern of transcription during early embryogenesis.";

RL Mech. Dev. 43:87-100(1993).

CC -!- FUNCTION: Implicated in cell fate specifications during 
CC embryo development. May be involved in the formation of the 

CC neural plate, notochord and brain vesicles. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- DEVELOPMENTAL STAGE: Expressed in all cells in pregastrulation
CC stages. During gastrulation is differentially expressed, 

CC accumulating predominantly in the prechordal mesoderm and 

CC notochord. At the end of gastrulation, expressed along the 

CC anterior-posterior axis including the developing neural plate 

CC and differentiating mesoderm. Also present in the developing 

CC brain and head regions. 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 36 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 6 ANK repeats. 

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X69088; CAA48831.1; -. 

DR PIR; S42612; S42612. 

DR HSSP; P00740; 1EDM. 

DR ZFIN; ZDB-GENE-990415-173; notch1a.

DR InterPro; IPR002110; ANK.

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR002049; Laminin_EGF. 

DR InterPro; IPR008297; Notch. 

DR InterPro; IPR000800; Notch_dom. 

DR Pfam; PF00023; ank; 6.

DR Pfam; PF00008; EGF; 36.



DR Pfam; PF00066; notch; 3.
DR PIRSF; PIRSF002279; Notch; 1.
DR PRINTS; PR00010; EGFBLOOD.
DR PRINTS; PR00011; EGFLAMININ.
DR PRINTS; PR01452; NOTCH.
DR SMART; SM00248; ANK; 6.
DR SMART; SM00179; EGF_CA; 22.
DR SMART; SM00004; NL; 3.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DR PROSITE; PS50088; ANK_REPEAT; 4.
DR PROSITE; PS00010; ASX_HYDROXYL; 23.
DR PROSITE; PS00022; EGF_1; 34.
DR PROSITE; PS01186; EGF_2; 28.
DR PROSITE; PS50026; EGF_3; 36.
DR PROSITE; PS01187; EGF_CA; 22.
KW Receptor; Transcription regulation; Activator; Differentiation;
KW Developmental protein; Neurogenesis; Repeat; ANK repeat; EGF-like domain;
KW Transmembrane; Glycoprotein; Signal.
FT DISULFID 702 711 BY SIMILARITY.
FT DISULFID 718 728 BY SIMILARITY.
FT DISULFID 723 737 BY SIMILARITY.
FT DISULFID 739 748 BY SIMILARITY.
FT DISULFID 755 766 BY SIMILARITY.


FT DISULFID 760 775 BY SIMILARITY.
FT DISULFID 777 786 BY SIMILARITY.
FT DISULFID 793 804 BY SIMILARITY.
FT DISULFID 798 813 BY SIMILARITY.
FT DISULFID 815 824 BY SIMILARITY.
FT DISULFID 831 842 BY SIMILARITY.


£ i 


JJl oULr IJJ 


R ^ £ 

O JO 


O JO 


av 

JZ> 1 


q TMTT.ARTTY 

Oll Jl Jjn.J\l IX. 


FT DISULFID 855 864 BY SIMILARITY.
FT DISULFID 871 882 BY SIMILARITY.
FT DISULFID 876 891 BY SIMILARITY.
FT DISULFID 893 902 BY SIMILARITY.
FT DISULFID 909 920 BY SIMILARITY.



Query Match 14.6%; Score 987; DB 1; Length 2437; 

Best Local Similarity 24.8%; Pred. No. 2.1e-51; 

Matches 310; Conservative 81; Mismatches 320; Indels 538; Gaps 70;

Qy 91 KSQCCPGFYESGEMCVPHCADKCVHGRC IAPNTCQCEPGWGGTNCS SACD 140

I I I I I : I : I :: I : I i : I : I I I I I I

Db 85 KCDCVLGF--SDRLCLTPVNHACMNSPCRNGGTCSLLTLDTFTCRCQPGWSGKTCQLA-- 140

Qy 141 GDHWGPHCTSRCQCKNGALCNPITG--ACHCAAGFRGWRCEDRCEQGTYGNDCH-QRCQC 197

II III I : I I I I I I : I I

Db 141 DPCASN-PCANGGQCSAFESHYICTCPPNFHGQTCRQDV NECAVSPSPC 188

Qy 198 QNGATCDHVTGE--CRCPPGYTGAFCEDL CPPGK 229

: I I I I : I II: I I I

Db 189 RNGGTCINEVGSYLCRCPPEYTGPHCQRLYQPCLPSPCRSGGTCVQTSDTTHTCSCLPGF 248

Qy 230 HGPQCE QRC PCQNG 243

III M M I I

Db 249 TGQTCEHNVDDCTQHACENGGPCIDGINTYNCHCDKHWTGQYCTEDVDECELSPNACQNG 308

Qy 244 GVCHHVTG--ECSCPSGWMGTVCGQ PCPEGRFGKN 276

Ml: I I I : I I i I : Mill

Db 309 GTCHNTIGGFHCVCWGWTGDDCSENIDDCASAACSHGATCHDRVASFFCECPHGRTGLL 368

Qy 277 C--SQEC QCHNGGTCDA--ATGQ--CHCS PGYTGERCQ DECPVGTYGVLCAETC 324

I | I I II : I : I I I I I I I I I I I : I

Db 369 CHLDDACISNPCQKGSNCDTNPVSGKAICTCPPGYTGSACNQDIDECSLGAN 420

Qy 325 QCVNGGKCYHVSGA--CLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMS 382



I : I I : I : I : II I : I I I I : Ml II : I : I 

Db 421 PCEHGGRCLNTKGSFQCKCLQGYEGPRCEMDV NEOKSNPC — QNDATCLDQI 470 

Qy 383 G — ECACKPGWSGLYCNET CSPGFYGEACQ QIC 413 

I I I I I : I : : I I I i I I I I 

Db 471 GGFHCICMPGYEGVFCQINSDDCASQPCLNGKCIDKINSFHCECPKGFSGSLCQVDVDEC 530 

Qy 414 S CQNGADCDSVTGK — CTCAPGFKGIDC STPCPL GTYGINCSSR 455 

: I : I I I I I I I I I I I I I I : 1 I I II 

Db 531 ASTPCKNGAKCTDGPNKYTCECTPGFSGIHCELDINECASSPCHYGVCRDGVASFTCDCR 590 

Qy 456 CG CKNDAVCSPVDGS — CTCKAGWHGVDCSIR 485 

I | : I I : : I I I I I I : I I 

Db 591 PGYTGRLCETNINECLSQPCRNGGTCQDRENAYICTCPKGTTGVNCEINIDDCKRKPCDY 650 

Qy 486 CPSGTWGFGCNLTCQ CLNGGACNTLDG TCTCAPGWRG 522 

I I I I I : i I I I I : I I I I I I : I 

Db 651 GKCIDKINGYECVCEPGYSGSMCNINIDDCALNPCHNGGTC — IDGVNS FTCLCPDGFRD 7 08 

Qy 523 EKC ELPCQDGTYGLNC AERC DCSHADGCHP 552 

I 1:1111 I I : I 

Db 709 ATCLSQHNECSSNPCIHGSCLDQINSYRCVCEAGWMGRNCDININECLSNPCVNGGTCKD 768 

Qy 553 -TTGH-CRCLPGWSGVHCDSVCAEGRWGP NCSL 583 

I : I : I I I : i I : I I I Ml 

Db 7 69 MTSGYLCTCRAGFSGPNCQMNINECASNPCLNQGSCIDDVAGFKCNCMLPYTGEVCENVL 82 8 

Qy 584 -PCY CKNGASCSPDD GICE 601 

I I I M I I : Mil 

Db 82 9 APCSPRPCKNGGVCRESEDFQSFSCNCPAGWQGQTCEVDINECVRNPCTNGGVCENLRGG 8 88 

Qy 602 CAPGFRGTTCQR 1 CS PGFYGHRCSQTCPQCV 632 

I || I I M M M I M : : M I 

Db 889 FQCRCNPGFTGALCENDIDDCEPNPCSNGGVCQDRVNGFVCVCLAGFRGERCAEDIDECV 948 

Qy 633 HSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNC AGICT CTNNGTCNPIDR 686 

Ml: I M : I I : MM 111 II I I M I M 

Db 949 — SAPCRNGGNCTDCVNSYT CS--CPAGFSGINCEINTPDCTES SCFNGGTC — VDG 999 

Qy 687 SCQCYPGWIGSDC SQP CPPAHWGPNC IH 714 

I I I I I : M I I M I I : I I I 

Db 1000 ISSFSCVCLPGFTGNYCQHDVNECDSRPCQNGGSCQDGYGTYKCTCPHGYTGLNCQSLVR 1059 

Qy 715 TCN CHNGAFC — SAYDGECKCTPGWTGLYCTQ 744 

I : Mil I M I I M M I 

Db 1060 WCDS S PCKNGGS CWQQGAS FTCQCAS GWTGI YCDVP SVS CEVAARQQGVS VAVLCRHAGQ 1119 

Qy 745 RCPLGFYGKDC ALICQ CQNGADC-DHI SG-QCTCRTGFMGRHCE 786 

II Ml I II M II M M M II M M I 

Db 1120 CVDAGNTHLCRCQAGYTGSYCQEQVDECQPNPCQNGATCTDYLGGYSCECVPGYHGMNCS 1179 

Qy 787 QK CPSGTYGYGCRQICD CLNN 807 

: : I li M I i II 

Db 1180 KEINECLSQPCQNGGTCIDLVNTYKCSCPRGTQGVHCEIDIDDCSPSVDPLTGEPRCFNG 1239 

Qy 808 STC-DHITG-TCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSY 854 

I I M I I M It M MM : MM 



Db 1240 GRCVDRVGGYGCVCPAGFVGERCE — GDVNE-CLSDPCDPSGSY 1280 



RESULT 9 
NOTC_DROME 

ID NOTC_DROME STANDARD; PRT; 2703 AA.

AC P07207; O97458; P04154; Q9W4T8;

DT 01-NOV-1986 (Rel. 03, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Neurogenic locus Notch protein precursor. 

GN N OR EG:140G11.1 OR EG:163A10.2 OR CG3936. 

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N . A. 

RC STRAIN=Oregon-R; TI SSUE-Embryo ; 

RX MEDLINE=8 607 953 9; PubMed=3 9 3532 5 ; 

RA Wharton K.A. , Johansen K.M. , Xu T . , Artavanis-Ts akonas S.; 

RT "Nucleotide sequence from the neurogenic locus notch implies a gene 

RT product that shares homology with proteins containing EGF-like 

RT repeats."; 

RL Cell 43:567-581(1985). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Canton-S, and Oregon-R; TISSUE=Embryo; 

RX MEDLINE=87064624; PubMed=3097517; 

RA Kidd S., Kelley M.R., Young M.W.; 

RT "Sequence of the notch locus of Drosophila melanogaster: relationship 

RT of the encoded protein to mammalian clotting and growth factors."; 

RL Mol. Cell. Biol. 6:3094-3108(1986). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., 

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 

RA Foster C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 

RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J., 

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 



RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 

RA Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X., 

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., 

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Oregon-R; 

RX MEDLINE=20196011; PubMed=10731137; 

RA Benos P.V., Gatt M.K., Ashburner M., Murphy L., Harris D., 

RA Barrell B.G., Ferraz C., Vidal S., Brun C., Demailles J., Cadieu E., 

RA Dreano S., Gloux S., Lelaure V., Mottier S., Galibert F., Borkova D., 

RA Minana B., Kafatos F.C., Louis C., Siden-Kiamos I., Bolshakov S., 

RA Papagiannakis G., Spanos L., Cox S., Madueno E., de Pablos B., 

RA Modolell J., Peter A., Schoettler P., Werner M., Mourkioti F., 

RA Beinert N., Dowe G., Schaefer U., Jaeckle H., Bucheton A., 

RA Callister D.M., Campbell L.A., Darlamitsou A., Henderson N.S., 

RA McMillan P.J., Salles C., Tait E.A., Valenti P., Saunders R.D.C., 

RA Glover D.M.; 

RT "From sequence to chromosome: the tip of the X chromosome of D. 

RT melanogaster."; 

RL Science 287:2220-2222(2000). 

RN [5] 

RP SEQUENCE OF 2505-2611 FROM N.A. 

RX MEDLINE=85099329; PubMed=2981631; 

RA Wharton K.A., Yedvobnick B., Finnerty V.G., Artavanis-Tsakonas S.; 

RT "opa: a novel family of transcribed repeats shared by the Notch locus 

RT and other developmentally regulated loci in D. melanogaster."; 

RL Cell 40:55-62(1985). 

RN [6] 

RP SEQUENCE OF 1-8 FROM N.A. 

RX MEDLINE=87257846; PubMed=3037327; 

RA Kelley M.R., Kidd S., Berg R.L., Young M.W.; 

RT "Restriction of P-element insertions at the Notch locus of Drosophila 

RT melanogaster."; 

RL Mol. Cell. Biol. 7:1545-1548(1987). 

RN [7] 

RP INTERACTION WITH DX, AND MUTANT SU42C. 

RX MEDLINE=94215489; PubMed=8162848; 

RA Diederich R.J., Matsuno K., Hing H., Artavanis-Tsakonas S.; 

RT "Cytosolic interaction between deltex and Notch ankyrin repeats 

RT implicates deltex in the Notch signaling pathway."; 

RL Development 120:473-481(1994). 



RN [8] 

RP INTERACTION WITH DX. 

RX MEDLINE=95401878; PubMed=7671825; 

RA Matsuno K., Diederich R.J., Go M.J., Blaumueller C.M., 

RA Artavanis-Tsakonas S.; 

RT "Deltex acts as a positive regulator of Notch signaling through 

RT interactions with the Notch ankyrin repeats."; 

RL Development 121:2633-2644(1995). 

RN [9] 

RP S3 CLEAVAGE BY PSN. 

RX MEDLINE=99221487; PubMed=10206646; 

RA Struhl G., Greenwald I.; 

RT "Presenilin is required for activity and nuclear access of Notch in 

RT Drosophila."; 

RL Nature 398:522-525(1999). 

RN [10] 

RP S3 CLEAVAGE BY PSN. 

RX MEDLINE=99221488; PubMed=10206647; 

RA Ye Y., Lukinova N., Fortini M.E.; 

RT "Neurogenic phenotypes and altered Notch processing in Drosophila 

RT Presenilin mutants."; 

RL Nature 398:525-529(1999). 

RN [11] 

RP S2 CLEAVAGE BY KUZ. 

RX MEDLINE=21657146; PubMed=11799064; 

RA Lieber T., Kidd S., Young M.W.; 

RT "kuzbanian-mediated cleavage of Drosophila Notch."; 

RL Genes Dev. 16:209-221(2002). 

RN [12] 

RP MUTANT MCD5. 

RX MEDLINE=21575956; PubMed=11719214; 

RA Ramain P., Khechumian K., Seugnet L., Arbogast N., Ackermann C., 

RA Heitzler P.; 

RT "Novel Notch alleles reveal a Deltex-dependent pathway repressing 

RT neural fate."; 

RL Curr. Biol. 11:1729-1738(2001). 

RN [13] 

RP REVIEW. 

RX MEDLINE=22256570; PubMed=12369105; 

RA Portin P.; 

RT "General outlines of the molecular genetics of the Notch signalling 

RT pathway in Drosophila melanogaster: a review."; 

RL Hereditas 136:89-96(2002). 

CC -!- FUNCTION: Signaling protein, which regulates, with both positive 

CC and negative signals, the differentiation of at least central and 

CC peripheral nervous system and eye, wing disk, oogenesis, segmental 

CC appendages such as antennae and legs, and muscles, through lateral 

CC inhibition or induction. Functions as a receptor for membrane- 

CC bound ligands Delta and Serrate to regulate cell-fate 

CC determination. Upon ligand activation, and releasing from the cell 

CC membrane, the Notch intracellular domain (NICD) forms a 

CC transcriptional activator complex with Su(H) (Suppressor of 

CC hairless) and activates genes of the E(spl) complex. Essential for 

CC proper differentiation of ectoderm. 

CC -!- SUBUNIT: Interacts with Su(H) when activated. Interacts with Dx 
CC via its ANK repeats. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Upon activation and 
CC S3 cleavage, it is released from the cell membrane and enters into 
CC the nucleus in conjunction with Su(H). 

CC -!- PTM: Upon binding its ligands such as Delta or Serrate, it is 
CC cleaved (S2 cleavage) in its extracellular domain, close to the 
CC transmembrane domain. S2 cleavage is probably mediated by Kuz. It 
CC is then cleaved (S3 cleavage) downstream of its transmembrane 
CC domain, releasing it from the cell membrane. S3 cleavage requires 
CC Psn. 

CC -!- SIMILARITY: Belongs to the NOTCH family. 
CC -!- SIMILARITY: Contains 36 EGF-like domains. 
CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 
CC -!- SIMILARITY: Contains 6 ANK repeats. 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 



DR EMBL; M16152; AAB59220.1; -. 

DR EMBL; M16153; AAB59220.1; JOINED. 

DR EMBL; M16149; AAB59220.1; JOINED. 

DR EMBL; M16150; AAB59220.1; JOINED. 

DR EMBL; M16151; AAB59220.1; JOINED. 

DR EMBL; K03508; AAA28725.1; -. 

DR EMBL; M13689; AAA28725.1; JOINED. 

DR EMBL; K03507; AAA28725.1; JOINED. 

DR EMBL; AE003426; AAF45848.2; -. 

DR EMBL; AL035436; CAB37610.1; -. 

DR EMBL; AL035395; CAB37610.1; JOINED. 

DR EMBL; M12175; AAA74496.1; -. 

DR EMBL; M16025; AAA28726.1; -. 




Query Match 14.5%; Score [illegible]; 

Best Local Similarity 26.8%; Pred. No. 7.4e-51; 

Matches 290; Conservative 102; Mismatches 297; Indels 395; Gaps 70; 



[Alignment rows starting at Qy 7, 61, 118, 169 and 197 (Db 502, 542, 577 and 627) are not recoverable from the source. Surviving row fragments: "-LNDGTC 541", "-CALGF--TGARCQINIDDCQSQPCRNR 576", "-NDCHQRCQ 196".] 

Db 686 CNNGATCIDGINS YKCQCVPGFTGQHCE KNVDECI S - S PCANNGVC I DQVNG YK 738 

Qy 253 CSCPSGWMGTVC GQP CPEGRFGKNCS QECQ-- 282 

I I I I : I I Ml III II 

Db 739 CECPRGFYDAHCLSDVDECASNPCVNEGRCEDGINEFICHCPPGYTGKRCELDIDECSSN 798 

Qy 283 -CHNGGTC-DAATG-QCHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCY-HV 335 

I : I I I I I I I I I I i I : : I : I : I 1 I I I I I I 

Db 799 PCQHGGTCYDKLNAFSCQCMPGYTGQKCETNIDDC VTNPCGNGGTCIDKV 848 

Qy 336 SG-ACLCEAGFAGERCEARLCPEGLYGIKC-DKRCPCHLENTHSCHPMSG ECACKP 389 

: I I : I : I I I I : : : I I II : I III III 

Db 849 NGYKCVCKVPFTGRDCESKMDP CASNRC KNEAKCTPSSNFLDFSCTCKL 897 

Qy 390 GWSGLYCNETCSPGFYGEACQQICSCQNGADCDSVTG--KCTCAPGFKGIDC 439 

I : : I I I : I : I I : I I I I : I I : I I I : : I I I 

Db 898 GYTGRYCDEDI DECSLSSPCRNGASCLNVPGSYRCLCTKGYEGRDCAINTDDCA 951 

Qy 440 STPCP LGTYGINC SSRC GCKNDAVCSPVDGS--CTCK 474 

Ml : I I I I I : I I I I I I I I 

Db 952 SFPCQNGGTCLDGIGDYSCLCVDGFDGKHCETDINECLSQPCQNGATCSQYVNSYTCTCP 1011 

Qy 475 AGWHGVDCSIR CPSGTWGFGCN LTCQ CLN 503 

I : I : : I I I I : I : II I II 

Db 1012 LGFSGINCQTNDEDCTESSCLNGGSCIDGINGYNCSCLAGYSGANCQYKLNKCDSNPCLN 1071 

Qy 504 GGACNTLDG--TCTCAPGWRGEKC ELPCQDGTYGLNCAERCDCSHADGCHPT 553 

I I : : III I : I : : I : I I : : I I I I 

Db 1072 GATCHEQNNEYTCHCPSGFTGKQCSEYVDWCGQSPCENG ATCSQMK--HQF 1120 

Qy 554 TGHCRCLPGWSGVHCD SVCAEGRWGPNCSLPCYCKNGASCSP--DDGICECAPGFRG 608 

: I : I I I : I I I I : II I I I : I : : | | : | : | 

Db 1121 S--CKCSAGWTGKLCDVQTISCQDAADRKGLSLRQLCNNG-TCKDYGNSHVCYCSQGYAG 1177 

Qy 609 TTCQR I CSPGFYGHRCSQTCPQCV HSSGPCH 639 

: I I : I III I I : I I I 

Db 1178 SYCQKEIDECQSQPCQNGGTCRDLIGAYECQCRQGFQGQNCELNIDDCAPNPCQNGGTCH 1237 

Qy 640 H--ITGLCDCLPGFTGALC NEVCPSGRFGKNCAGICTCTNNGTCNPIDR SCQC 690 

: I I I I I : I : I I I I I I : I I I I II 

Db 1238 DRVMNFSCSCPPGTMGIICEINKDDCKPG ACHNNGSC--I DRVGGFECVC 1285 

Qy 691 YPGWIGSDC SQPCP PAHWGPNCIHTCN 717 

I I : : I : I III I I I : I I : 

Db 1286 QPGFVGARCEGDINECLSNPCSNAGTLDCVQLVNNYHCNCRPGHMGRHCEHKVDFCAQSP 1345 

Qy 718 CHNGAFCSAYDGECKCTPGWTGLYCTQRCPLGFYGKDCALICQCQNGADCDHIS GQC 774 

III I : : I : I I I I I I I : I I : I I I I II 

Db 1346 CQNGGNCNIRQ SGHHCI--CNNGFYGKNCEL SGQDCDSNPCRVGNC 1389 

Qy 775 TCRTGFMGRHCEQKCPSGTYGYGCR QICD CLNNSTCDHITG--TCYCSPGWKG 825 

I I I I I I I I I I I : | : : | II III 

Db 1390 WADEGFGYRCE--CPRGTLGEHCEIDTLDECSPNPCAQGAACEDLLGDYECLCPSKWKG 1447 

Qy 826 ARCD 829 

III 

Db 1448 KRCD 1451 
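
The Matches, Mismatches and Indels figures reported for each result can in principle be recounted from the aligned Qy/Db rows. The following is a minimal Python sketch, assuming gaps are written as "-" and that both rows have equal length; the function name count_columns is hypothetical and is not part of GenCore. It does not separate conservative substitutions from other mismatches, since that would require the BLOSUM62 table, and GenCore's exact counting convention is not stated in this output.

```python
def count_columns(qy: str, db: str) -> dict:
    """Classify each column of a pairwise alignment row pair."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    counts = {"matches": 0, "mismatches": 0, "indels": 0}
    for q, d in zip(qy, db):
        if q == "-" or d == "-":
            counts["indels"] += 1      # gap in either row
        elif q == d:
            counts["matches"] += 1     # identical residues
        else:
            counts["mismatches"] += 1  # any substitution, conservative or not
    return counts


# Final row pair of the alignment above: Qy 826 ARCD 829 / Db 1448 KRCD 1451
print(count_columns("ARCD", "KRCD"))
```

Summing such counts over every row of an alignment gives totals that can be compared with the summary line of the same result.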



RESULT 10 
NTC3_MOUSE 



ID NTC3_MOUSE STANDARD; PRT; 2318 AA. 

AC Q61982; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 3 precursor (Notch 3). 

GN NOTCH3. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=ICR X Swiss Webster; 

RX MEDLINE=95001556; PubMed=7918097; 

RA Lardelli M., Dalstrand J., Lendahl U.; 

RT "The novel Notch homologue mouse Notch 3 lacks specific epidermal 

RT growth factor-repeats and is expressed in proliferating 

RT neuroepithelium."; 

RL Mech. Dev. 46:123-136(1994). 

RN [2] 

RP POST-TRANSLATIONAL PROCESSING, AND MUTAGENESIS OF MET-1664. 

RX MEDLINE=21523956; PubMed=11518718; 

RA Saxena M.T., Schroeter E.H., Mumm J.S., Kopan R.; 

RT "Murine notch homologs (N1-4) undergo presenilin-dependent 

RT proteolysis."; 

RL J. Biol. Chem. 276:40268-40273(2001). 

RN [3] 

RP POST-TRANSLATIONAL PROCESSING. 

RX MEDLINE=21374376; PubMed=11459941; 

RA Mizutani T., Taniguchi Y., Aoki T., Hashimoto N., Honjo T.; 

RT "Conservation of the biochemical mechanisms of signal transduction 

RT among mammalian Notch family members."; 

RL Proc. Natl. Acad. Sci. U.S.A. 98:9026-9031(2001). 

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands 

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination. 

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity). May play a role during CNS 

CC development. 

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N- 

CC terminal fragment N(EC) which are probably linked by disulfide 

CC bonds. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 

CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- TISSUE SPECIFICITY: Proliferating neuroepithelium. 

CC -!- DEVELOPMENTAL STAGE: CNS development. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C- 



CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following 

CC ligand binding, it is cleaved by TNF-alpha converting enzyme 

CC (TACE) to yield a membrane-associated intermediate fragment called 

CC notch extracellular truncation (NEXT). This fragment is then 

CC cleaved by presenilin dependent gamma-secretase to release a 

CC notch-derived peptide containing the intracellular domain (NICD) 

CC from the membrane. 

CC -!- PTM: Phosphorylated. 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 34 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 5 ANK repeats. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; X74760; CAA52776.1; -. 

DR PIR; S45306; S45306. 

DR HSSP; P00740; 1EDM. 

DR MGD; MGI:99460; Notch3. 

DR GO; GO:0005887; C:integral to plasma membrane; IC. 

DR GO; GO:0005515; F:protein binding; IPI. 

DR GO; GO:0007219; P:N signaling pathway; IC. 

DR InterPro; IPR002110; ANK. 

DR InterPro; IPR000152; Asx_hydroxyl_S. 

DR InterPro; IPR000742; EGF_2. 

DR InterPro; IPR001881; EGF_Ca. 

DR InterPro; IPR001438; EGF_II. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR002049; Laminin_EGF. 

DR InterPro; IPR008297; Notch. 

DR InterPro; IPR000800; Notch_dom. 

DR Pfam; PF00023; ank; 6. 

DR Pfam; PF00008; EGF; 33. 

DR Pfam; PF00066; notch; 3. 

DR PIRSF; PIRSF002279; Notch; 1. 

DR PRINTS; PR00010; EGFBLOOD. 

DR PRINTS; PR00011; EGFLAMININ. 

DR PRINTS; PR01452; NOTCH. 

DR SMART; SM00248; ANK; 6. 

DR SMART; SM00179; EGF_CA; 19. 

DR SMART; SM00004; NL; 3. 

DR PROSITE; PS50297; ANK_REP_REGION; 1. 

DR PROSITE; PS50088; ANK_REPEAT; 4. 

DR PROSITE; PS00010; ASX_HYDROXYL; 18. 

DR PROSITE; PS00022; EGF_1; 33. 

DR PROSITE; PS01186; EGF_2; 27. 

DR PROSITE; PS50026; EGF_3; 34. 

DR PROSITE; PS01187; EGF_CA; 16. 

KW Receptor; Transcription regulation; Activator; Differentiation; 

KW Developmental protein; Repeat; ANK repeat; EGF-like domain; 

KW Transmembrane; Glycoprotein; Signal; Phosphorylation. 
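
The DR lines above all follow one pattern: a database name, a primary identifier, and zero or more further fields, separated by semicolons and closed by a full stop. A minimal Python sketch for splitting such a line is given below; the function name parse_dr_line is hypothetical, and the sketch assumes the single-space layout used in this listing rather than the column-aligned layout of the original flat file.

```python
def parse_dr_line(line: str) -> tuple:
    """Split a 'DR' line into (database, primary id, remaining fields)."""
    if not line.startswith("DR"):
        raise ValueError("not a DR line")
    # Drop the line code and the terminating full stop, then split on ';'.
    body = line[2:].strip().rstrip(".")
    fields = [f.strip() for f in body.split(";") if f.strip()]
    return fields[0], fields[1], fields[2:]


print(parse_dr_line("DR Pfam; PF00008; EGF; 33."))
print(parse_dr_line("DR EMBL; X74760; CAA52776.1; -."))
```

For the Pfam, SMART and PROSITE lines the last field is a hit count, so the remaining-fields list can be used to tabulate, for example, how many EGF domains each method reports.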
Query Match 14.5%; Score 977.5; DB 1; Length 2318; 

Best Local Similarity 25.8%; Pred. No. 7.2e-51; 

Matches 322; Conservative 77; Mismatches 317; Indels 532; Gaps 74; 

Qy 9 LSFICLLLCHWIGTASPLNLEDPNVCSKWESYSVTVQESYPHPFDQIYYTSCTDILNWFK 68 

I III i : 1 I I I I I II : : I I 

Db 62 LEAACLCLPGWVG--ERCQLEDP--C HSGPCAGRGVCQSSWAGTARFS 106 

Qy 69 CTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCV PHCADKCVHGR--CIAPN- 122 

I : I I I I I I : I I I I : I : 

Db 107 C RCLRGF--QGPDCSQPDPCVSRPCVHGAPCSVGPDG 141 

Qy 123 --TCQCEPGWGGTNCSSACDGDHWGPHC TSRCQ 153 

I I I I : I : I I I II : III 

Db 142 RFACACPPGYQGQSCQSDIDECRSGTTCRHGGTCLNTPGSFRCQCPLGYTGLLCENPWP 201 

Qy 154 CKNGALC NPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATC-D 204 

I : I I i : : I II III II : II : I I I I I I 

Db 202 CAPSPCRNGGTCRQSSDVTYDCACLPGFEGQNCEVNVD DCPGH-RCLNGGTCVD 254 

Qy 205 HV-TGECRCPPGYTGAFC-EDLCPPGKHGPQCE-QRCPCQNGGVCHHVTG--ECSCPSGW 259 

II I : I II : I I I I I I : : I : I I II I I ' ' I I I : I I 

Db 255 GVNTYNCQCPPEWTGQFCTEDV DECQLQPNACHNGGTCFNLLGGHSCVCVNGW 307 



Qy 260 MGTVCGQ PCPEGRFGKNC--SQEC QCHNGGTC 289 

III I I I : I I I I I I 

Db 308 TGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMGKTGLLCHLDDACVSNPCHEDAIC 367 

Qy 290 DA--ATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNGGKCYHVSGA--CL 340 

I : I : I I I I : I I I I I I : I | | : : | : | : I : | 

Db 368 DTNPVSGRAICTCPPGFTGGACDQDVDECSIG ANPCEHL--GRCVNTQGSFLCQ 419 

Qy 341 CEAGFAGERCEARL--CPEGLYGIKCDKRCPCHLENTHSCHPMSGE--CACKPGWSGLYC 396 

| | : I I I I : I I 1 I I : I I : I I I : : I I I 

Db 420 CGRGYTGPRCETDWECLSG PC--RNQATCLDRIGQFTCICMAGFTGTYC 467 

Qy 397 NE TCSPGFYGEACQ QICS CQNGADC-DSV 424 

I I I I I I I I : I : I II I I 

Db 468 EVDIDECQSSPCVNGGVCKDRVNGFSCTCPSGFSGSMCQLDVDECASTPCRNGAKCVDQP 527 

Qy 425 TG-KCTCAPGFKGI DCS-TPCPLGTYGINCSSRCGCKNDAVCSPVDG SC 471 

I : I I I I I : I I I I I I I II Ml M 

Db 528 DGYECRCAEGFEGTLCERNVDDCSPDPCHHG RC VDGIASFSC 569 

Qy 472 TCKAGWHGVDCS IRCPSGTWGFGCNLTC-QCLNG- 504 

I I : I : I I I I I I I I : I : 

Db 570 ACAPGYTGIRCESQVDECRSQPCRYGGKCLDLVDKYLCRCPPGTTGVNCEVNIDDCASNP 629 

Qy 505 GACNTLDG TCTCAPGWRGEKCEL PCQDGTYGLNC 538 

II II I I II: I I : I I I I : I 

Db 630 CTFGVCR--DGINRYDCVCQPGFTGPLCNVEINECASSPCGEGGSCVDGENGFHCLCPPG 687 

Qy 539 AERCDCSHADGCHPTTG--HCRCLPGWSGVHCDSVCAEGRWGPNCSLP 584 

I I I II I I I I I I I I I I I : 

Db 688 SLPPLCLPANHPCAHKPCSHG-VCHDAPGGFRCVCEPGWSGPRCSQSLA PDACES 741 

Qy 585 CYCKNGASCSPDDGI CECAPGFRGTTCQRI CS 616 

I : I : I : I I I I I I I i I : I I : : I 
Db 742 QPCQAGGTCT-SDGIGFRCTCAPGFQGHQCEVLSPCTPSLCEHGGHCESDPDRLTVCSCP 800 

Qy 617 PGFYGHRCSQTCPQCVHSS GPCHHITG — LCDCLPGFTGALCNE 658 

Ihllll : I : I I I : : I I I 1 = 11 I : : 

Db 801 PGWQGPRCQQDVDECAGASPCGPHGTCTNLPGNFRCICHRGYTGPFCDQDIDDCDPNPCL 860 

Qy 659 VCPSGRFGKNCA GICT 674 

I I I I I III 
Db 861 HGGSCQDGVGSFSCSCLDGFAGPRCARDVDECLSSPCGPGTCTDHVASFTCACPPGYGGF 920 

Qy 675 CTNNGTCNPIDR SCQCYPGWIGSDC SQP C 703 

I I I I I : I | | | | | : | : I I : I I 

Db 921 HCEIDLPDCSPSSCFNGGTC--VDGVSSFSCLCRPGYTGTHCQYEADPCFSRPCLHGGIC 978 

Qy 704 PPAHWGPNCIHTCN CHNGAFCSAYDGECKCTPGWTG 739 

I I I I I I I II I I I I I I : I 

Db 979 NPTHPGFEC--TCREGFTGSQCQNPVDWCSQAPCQNGGRCVQTGAYCICPPGWSGRLCDI 1036 

Qy 740 --LYCTQR CPLGFYGKDC ALICQCQN 763 

III: I I I I I 

Db 1037 QSLPCTEAAAQMGVRLEQLCQEGGKCIDKGRSHYCVCPEGRTGSHCEHEVDPCTAQPCQH 1096 



Qy 764 GADCDHISG--QCTCRTGFMGRHCEQ KCPSGTYGY 796 

II I 1 I I : I I I I I I I I 

Db 1097 GGTCRGYMGGYVCECPAGYAGDSCEDNIDECASQPCQNGGSCIDLVARYLCSCPPGTLGV 1156 

Qy 797 GC RQICD CLNNSTCDHITG--TCYCSPGWKGARCD 829 

1 II I I : I I I : I I I I I : I I : 

Db 1157 LCEINEDDCDLGPSLDSGVQCLHNGTCVDLVGGFRCNCPPGYTGLHCE 1204 
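
The two percentages in the Result 10 summary appear to be simple ratios of other figures in this report: Query Match agrees with the score divided by the perfect score (6744 in the run header), and Best Local Similarity agrees with Matches divided by the sum of Matches, Conservative, Mismatches and Indels. This is an inference from the printed numbers, not a documented GenCore formula. A minimal Python sketch that checks it, with hypothetical function names:

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's self-match (perfect) score."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Result 10 (NTC3_MOUSE): Score 977.5; Matches 322; Conservative 77;
# Mismatches 317; Indels 532. The perfect score 6744 is from the run header.
print(round(query_match_percent(977.5, 6744), 1))           # 14.5
print(round(best_local_similarity(322, 77, 317, 532), 1))   # 25.8
```

The same ratios reproduce the 26.8% reported for Result 9 (290 of 1084 columns) and the 14.4% Query Match of Result 11 (974 of 6744).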



RESULT 11 
NTC3_HUMAN 

ID NTC3_HUMAN STANDARD; PRT; 2321 AA. 

AC Q9UM47; Q9UEB3; Q9UPL3; Q9Y6L8; 

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 3 precursor (Notch 3). 

GN NOTCH3. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=97032728; PubMed=8878478; 

RA Joutel A., Corpechot C., Ducros A., Vahedi K., Chabriat H., Mouton P., 

RA Alamowitch S., Domenga V., Cecillion M., Marechal E., Maciazek J., 

RA Vayssiere C., Cruaud C., Cabanis E.-A., Ruchoux M.M., Weissenbach J., 

RA Bach J.-F., Bousser M.-G., Tournier-Lasserve E.; 

RT "Notch3 mutations in CADASIL, a hereditary adult-onset condition 

RT causing stroke and dementia."; 

RL Nature 383:707-710(1996). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Gunel M., Artavanis-Tsakonas S.; 

RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RA Lamerdin J.E., McCready P.M., Skowronski E., Adamson A.W., 

RA Burkhart-Schultz K., Gordon L., Kyle A., Ramirez M., Stilwagen S., 

RA Phan H., Velasco N., Games J., Danganan L., Poundstone P., 

RA Christensen M., Georgescu A., Avila J., Liu S., Attix C., Andreise T., 

RA Trankheim M., Amico-Keller G., Coefield J., Duarte S., Lucas S., 

RA Bruce R., Thomas P., Quan G., Kronmiller B., Arellano A., 

RA Montgomery M., Ow D., Nolan M., Trong S., Kobayashi A., Olsen A.S., 

RA Carrano A.V.; 

RT "Sequence analysis of an 1.5 Mb olfactory receptor (OLFR) cluster in 

RT 19p13.1."; 

RL Submitted (MAY-1998) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP VARIANTS CADASIL TYR-49; CYS-71; CYS-90; CYS-110; CYS-133; CYS-141; 

RP ARG-146; CYS-153; CYS-169; CYS-171; CYS-182; ARG-185; SER-212; 

RP GLY-222; TYR-224; CYS-258; TYR-542; CYS-558; CYS-578; CYS-728; 

RP CYS-985; CYS-1006; CYS-1031; CYS-1231 AND ARG-1261, AND VARIANTS 

RP ARG-170; LEU-496; GLN-1133; MET-1183 AND ALA-2223. 

RX MEDLINE=98049753; PubMed=9388399; 

RA Joutel A., Vahedi K., Corpechot C., Troesch A., Chabriat H., 

RA Vayssiere C., Cruaud C., Maciazek J., Weissenbach J., Bousser M.-G., 

RA Bach J.-F., Tournier-Lasserve E.; 

RT "Strong clustering and stereotyped nature of Notch3 mutations in 

RT CADASIL patients."; 

RL Lancet 350:1511-1515(1997). 

RN [5] 

RP VARIANT CADASIL 114-GLY--PRO-120 DEL. 

RX MEDLINE=20264473; PubMed=10802807; 

RA Joutel A., Chabriat H., Vahedi K., Domenga V., Vayssiere C., 

RA Ruchoux M.M., Lucas C., Leys D., Bousser M.-G., Tournier-Lasserve E.; 

RT "Splice site mutation causing a seven amino acid Notch3 in-frame 

RT deletion in CADASIL."; 

RL Neurology 54:1874-1875(2000). 

RN [6] 

RP IDENTIFICATION OF LIGANDS. 

RX MEDLINE=99180765; PubMed=10079256; 

RA Gray G.E., Mann R.S., Mitsiadis E., Henrique D., Carcangiu M.-L., 

RA Banks A., Leiman J., Ward D., Ish-Horowitz D., Artavanis-Tsakonas S.; 

RT "Human ligands of the Notch receptor."; 

RL Am. J. Pathol. 154:785-794(1999). 

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands 
CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination. 

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity). 

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N- 
CC terminal fragment N(EC) which are probably linked by disulfide 

CC bonds (By similarity). 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 
CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- TISSUE SPECIFICITY: Ubiquitously expressed in fetal and adult 
CC tissues. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C- 

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following 

CC ligand binding, it is cleaved by TNF-alpha converting enzyme 

CC (TACE) to yield a membrane-associated intermediate fragment called 

CC notch extracellular truncation (NEXT). This fragment is then 

CC cleaved by presenilin dependent gamma-secretase to release a 

CC notch-derived peptide containing the intracellular domain (NICD) 

CC from the membrane (By similarity). 

CC -!- PTM: Phosphorylated (By similarity). 

CC -!- DISEASE: Defects in NOTCH3 are associated with cerebral autosomal 

CC dominant arteriopathy with subcortical infarcts and 

CC leukoencephalopathy (CADASIL) [MIM:125310]. CADASIL causes a type 

CC of stroke and dementia of which key features include recurrent 

CC subcortical ischemic events and vascular dementia. 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 34 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 5 ANK repeats. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 
Query Match 14.4%; Score 974; DB 1; Length 2321; 

Best Local Similarity 25.0%; Pred. No. 1.2e-50; 
Qy 59 SCTDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESGEMCVPHCADKC 113 

: | I : I : I I I I : : I : I I : I 

Db 250 TCVDGVNTYNC QCPPEW--TGQFCTED-VDECQLQPN 283 

Qy 114 ^VH--GRC IAPNTCQCEPGWGGTNCSSACDG DHWGPHCTSR CQC 154 

I I I : : : I I I I I : I I I 111 I I 

Db 284 ACHNGGTCFNTLGGHSCVCVNGWTGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMG 343 

Qy 155 KNGALC NPITG--ACHCAAGFRGWRCE DRCE 183 

I I I I I I : i I I I I I I : I ! 

Db 344 KTGLLCHLDDACVSNPCHEDAICDTNPVNGRAICTCPPGFTGGACDQDVDECSIGANPCE 403 



Qy 184 QGTYGNDCHQ RCQ CQNGATCDHVTGE--CRCPPGYTG 218 

||:: | : I I : hill I h II h I I 

Db 404 HLGRCVNTQGSFLCQCGRGYTGPRCETDVNECLSGPCRNQATCLDRIGQFTCICMAGFTG 463 

Qy 219 AFCE DLCPPGKHGPQCEQRCPCQNGGVC-HHVTG-ECSCPSGWMGTVC-- 264 

: | | || I 1 I I I I I I II h I I I I: h I 

Db 464 TYCEVDIDEC QSSPCVNGGVCKDRVNGFSCTCPSGFSGSTCQLDVDECAS 513 

Qy 265 GQP CPEGRFGKNCSQ ECQ CHNGGTCDA-ATGQCHCSPG 301 

II Mill: : I I h I I h I h I I 

Db 514 TPCRNGAKCVDQPDGYECRCAEGFEGTLCDRNVDDCSPDPCHHGRCVDGIASFSCACAPG 573 

Qy 302 YTGERCQDE CPVGTYGVLC AETCQ 325 

Mill:: I I M I I I : I 

Db 574 YTGTRCESQVDECRSQPCRHGGKCLDLVDKYLCRCPSGTTGVNCEVNIDDCASNPCTFGV 633 

Qy 326 CVNGGKCYHVSGACLCEAGFAGERCEARL CPEGLYGIKCDKRCP 369 

I : I I I : I : I I I I : 1=1 1=1 II 

Db 634 CRDGINRYD CVCQPGFTGPLCNVEINECASSPCGEGGSCVDGENGFRC--LCPPGS 687 

Qy 370 CHLENTHSC HPMSGECACKPGWSGLYCNE 398 

| I : I I I I I : I I I I I h : 

Db 688 LPPLC-LPPSHPCAHEPCSHGICYDAPGGFRCVCEPGWSGPRCSQSLARDACESQPCRAG 746 

Qy 399 TCSPGFYGEACQQI--CS CQNGADCDSVTGK CTCAPGFKG-- 436 

II I I I I : : I = h M hi h hi h : I 
Db 747 GTCSSDGMGFHCTCPPGVQGRQCELLSPCTPNPCEHGGRCESAPGQLPVCSCPQGWQGPR 806 

Qy 437 IDCSTPCPLGTYGINCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTW 491 

: h I I I Ml h : I I II I i : ! I 

Db 807 CQQDVDECAGPAPCGPHGI - CTNLAG SFSCTCHGGYTGPSCDQDIND 852 

Qy 492 GFGCNLTCQCLNGGACNTLDG TCTCAPGWRGEKC ELPCQDGTYGLNCA 539 

I : I I I I I M II :| M I h I M II M I 
Db 853 CDPN-PCLNGGSCQ--DGVGSFSCSCLPGFAGPRCARDVDECLSNPCGPGT CT 902 

Qy 540 ERCDCSHADGCHPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI 599 

: I : I I I h I I h h I I I I M M I h 

Db 903 D HVASFTCTCPPGYGGFHCEQDL PDCS-PSSCFNGGTCV--DGV 943 

Qy 600 CECAPGFRGTTCQR ICSPGFYGHRC SQTCPQC 631 

I I II: I II Ml III I I II I 

Db 944 NSFSCLCRPGYTGAHCQHEADPCLSRPCLHGGVCSAAHPGFRCTCLESFTGPQCQTLVDW 1003 

Qy 632 VHSSGPCHHITGLCDCLPGFTGALCNE 658 

: I I I I I I : M I h 

Db 1004 CSRQPCQNGGRCVQTGAYCLCPPGWSGRLCDIRSLPCREAAAQIGVRLEQLCQAGGQCVD 1063 

Qy 659 VCPSGRFGKNC AGIC TCTNNGTCNPI --DRSCQCYPGWIGSDC 699 

I M II I M I I : 11 I I M I h I M 

Db 1064 EDSSHYCVCPEGRTGSHCEQEVDPCLAQPCQHGGTCRGYMGGYMCECLPGYNGDNCEDDV 1123 

Qy 700 SQPC PPAHWGPNCIHTCNCH 719 

MM I I I I h I I 

Db 1124 DECASQPCQHGGSCIDLVARYLCSCPPGTLGVLCEINEDDCGPGPPLDSGPRCLHNGTCV 1183 



Qy 720 N- -GAFCSAYDGECKCTPGWTGLYCTQ- RCPLG 749 

: I I I I Ihlll I II 

Db 1184 DLVGGF RCTCPPGYTGLRCEADINECRSGACHAAHTRDCLQDPGGGFRCLCHAG 1237 

Qy 750 FYGKDCALI CQ---CQNGADCDHISG QCTCRTGFMGRHCEQ 787 

Ml: I : I I : I I I I I 1111 = 

[Db row starting at 1238 is not recoverable from the source.] 

Qy 788 -KCPSGTYGYGCRQI CDCLNNSTCDHIT 814 

I I I I I I I I : : I 

[Db row starting at 1298, Qy row starting at 815 and Db row starting at 1358 are not recoverable from the source.] 

I : 1 I I II: 



RESULT 12 
NTC3_RAT 

ID NTC3_RAT STANDARD; PRT; 2319 AA. 

AC Q9R172; 

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 3 precursor (Notch 3). 

GN NOTCH3. 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Haritunians T., Boulter J., Weinmaster G., Schanen N.C.; 

RT "Rattus norvegicus mRNA for Notch 3."; 

RL Submitted (SEP-2000) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP FUNCTION. 

RX MEDLINE=21094508; PubMed=11182080; 

RA Tanigaki K., Nogaki F., Takahashi J., Tashiro K., Kurooka H., 

RA Honjo T.; 

RT "Notch1 and Notch3 instructively restrict bFGF-responsive multipotent 

RT neural progenitor cells to an astroglial fate."; 

RL Neuron 29:45-55(2001). 

RN [3] 

RP TISSUE SPECIFICITY. 

RX MEDLINE=21331789; PubMed=11438922; 

RA Irvin D.K., Zurcher S.D., Nguyen T., Weinmaster G., Kornblum H.I.; 

RT "Expression patterns of Notch1, Notch2, and Notch3 suggest multiple 

RT functional roles for the Notch-DSL signaling system during brain 

RT development."; 

RL J. Comp. Neurol. 436:167-181(2001). 

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity). Acts instructively to control



CC the cell fate determination of CNS multipotent progenitor cells, 

CC resulting in astroglial induction and neuron/oligodendrocyte

CC suppression. 

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N-

CC terminal fragment N(EC) which are probably linked by disulfide

CC bonds (By similarity).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 

CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- TISSUE SPECIFICITY: Expressed in postnatal central nervous system 

CC (CNS) germinal zones and, in early postnatal life, within 

CC numerous cells throughout the CNS. It is more highly localized 

CC to ventricular germinal zones.

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C-

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following 

CC ligand binding, it is cleaved by TNF-alpha converting enzyme 

CC (TACE) to yield a membrane-associated intermediate fragment called 

CC notch extracellular truncation (NEXT). This fragment is then 

CC cleaved by presenilin dependent gamma-secretase to release a

CC notch-derived peptide containing the intracellular domain (NICD)

CC from the membrane (By similarity).

CC -!- PTM: Phosphorylated (By similarity). 

CC -!- SIMILARITY: Belongs to the NOTCH family. 

CC -!- SIMILARITY: Contains 34 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 5 ANK repeats. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF164486; AAD46653.2; 

DR HSSP; P00740; 1EDM. 

DR InterPro; IPR002110; ANK. 

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR002049; Laminin_EGF.

DR InterPro; IPR008297; Notch. 

DR InterPro; IPR000800; Notch_dom. 

DR Pfam; PF00023; ank; 6. 

DR Pfam; PF00008; EGF; 33. 

DR Pfam; PF00066; notch; 3. 

DR PIRSF; PIRSF002279; Notch; 1. 

DR PRINTS; PR00010; EGFBLOOD.

DR PRINTS; PR00011; EGFLAMININ. 

DR PRINTS; PR01452; NOTCH. 

DR SMART; SM00248; ANK; 6. 

DR SMART; SM00179; EGF_CA; 20.

Query Match 14.4%; Score 969.5; DB 1; Length 2319;

Best Local Similarity 25.1%; Pred. No. 2.2e-50; 

Matches 303; Conservative 71; Mismatches 304; Indels 531; Gaps 62;

Qy 107 PHCAD KCVHGRCIAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGA 158

I I I : I I : I I II 1 I I I I II I I I : :

Db 42 PPCLDGSPCANGGRCTHQQPSREAACLCLPGWVGERCQLE-DPCHSGP-CAGRGVCQSSV 99

Qy 159 LCNPITGACHCAAGFRGWRCE--DRC

: : I I I I I I I I I > I

Db 100 VAGVARFSCRCLRGFRGPDCSLPDPCFSSPCAHGAPCSVGSDGRYACACPPGYQGRNCRS 159 

Qy 189 --NDCHQRCQCQNGATCDHVTG--ECRCPPGYTGAFCED LCPPGKHGPQCEQRCPCQ 241 

: : I | : : | M : I I I I I I I I I I : I I I I : 

Db 160 DIDECRAGASCRHGGTCINTPGSFHCLCPLGYTGLLCENPIVPCAPS PCR 209 

Qy 242 NGGVCHH VTGECSCPSGWMGTVCG QPCPEGRFGKNCSQECQCHNGGTC--DAAT 293

I I I I I I : I : I I : I I 111 I I I I I I I

Db 210 NGGTCRQSSDVTYDCACLPGFEGQNCEVNVDDCPGHR CLNGGTCVDGVNT 259

Qy 294 GQCHCSPGYTGERCQ DEC PVGT YGVLCAETCQCVNGGKCYHVS G--ACLCEAGFAGE 348

I I I : I I : I III: : I I I I I : : : I : I : I I : I I

I : I I | | | | : : | I : :
Db 372 PVSGRAICTCPPGFTGGACDQDVDECSIGANPCEHLGRCVNTQGSFLCQCGRGYTGPRCE 431 

Qy 399 TCSPGFYGEACQ QICSCQNGADC-DSVT 425 

I I I I I : I I I I I I I 

Db 432 TDVNECLSGPCRNQATCLDRIGQFTCICMAGFTGTFCEVDIDECQSSPCVNGGVCKDRVN 491 

Qy 426 G-KCTCAPGFKGIDC STPCPLGTY GINCSSRCG C-KNDA 462 

I I I I I I I I I I I I I I I I I I : I 

Db 492 GFSCTCPSGFSGSTCQLDVDECASTPCRNGAKCVDQPDGYEC--RCAEGFEGTLCERNVD 549

Qy 463 VCSP VDG SCTCKAGWHGVDCS IR 485 

Ml III | | | 1 : I : | I 

Db 550 DCSPDPCHHGRCVDGIASFSCACAPGYTGIRCESQVDECRSQPCRYGGKCLDLVDKYLCR 609 

Qy 486 CPSGTWGFGCNLTC-QCLNG GACNTLDG TCTCAPGWRGEKCEL 527 

I I I I I I : i : II II 1111:11: 

Db 610 CPPGTTGVNCEVNIDDCASNPCTFGVCR--DGINRYDCVCQPGFTGPLCNVEINECASSP 667

Qy 528 PCQDGTYGLNC AERCDCSHADGCHPTTG--HCRCLPGW 563

I I I I : I I II II i Mill

Db 668 CGEGGSCVDGENGFHCLCPPGSLPPLCLPANHPCAHKPCSHG-VCHDAPGGFQCVCDPGW 726

Qy 564 SGVHCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGI CECAPGFRGTTCQRI 614 

|| | I I : I : I : I : III I I I I I I : I I : : 

Db 727 SGPRCSQSLA PDACESQPCQAGGTCT- SDGIGFHCTCAPGFQGHQCEVLSPCTPS 780 

Qy 615 CSPGFYGHRCSQTCPQCVHSS GPCHHITG--LCDCL 648

I I I : I I I I : I : I I I : : I II 

Db 781 LCEHGGHCESDPDQLTVCSCPPGWQGPRCQQDVDECAGASPCGPHGTCTNLPGSFRCICH 840 

Qy 649 PGFTGALCNE VCPSGRFGKNCA 670

| : I I I : : I I I I I I 

Db 841 GGYTGPFCDQDIDDCDPNPCLNGGSCQDGVGSFSCSCLSGFAGPRCARDVDECLSSPCGP 900 

Qy 671 GICT CTNNGTCNPIDR SCQCYPGWIG 696 

Ml I I I I I : I I I I I I : I

RESULT 13
NTC4_HUMAN

ID NTC4_HUMAN STANDARD; PRT; 2003 AA.

AC Q99466; O00306; Q99458; Q99940; Q9H3S8; Q9UII9; Q9UIJ0;

DT 28-FEB-2003 (Rel. 41, Created) 

DT 28-FEB-2003 (Rel. 41, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 4 precursor (Notch 4) 

DE (hNotch4).

GN NOTCH4.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND POLYMORPHISM OF POLY-LEU. 

RC TISSUE=Placenta; 

RX MEDLINE=97311416; PubMed=9168133;

RA Sugaya K., Sasanuma S.-I., Nohata J., Kimura T., Fukagawa T.,

RA Nakamura Y., Ando A., Inoko H., Ikemura T., Mita K.;

RT "Gene organization of human NOTCH4 and (CTG)n polymorphism in this

RT human counterpart gene of mouse proto-oncogene Int3.";

RL Gene 189:235-244(1997).

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORMS 1, 2 AND 3).

RC TISSUE=Bone marrow, and Heart;

RX MEDLINE=98360091; PubMed=9693032;

RA Li L., Huang G.M., Banta A.B., Deng Y., Smith T., Dong P.,

RA Friedman C., Chen L., Trask B.J., Spies T., Rowen L., Hood L.;

RT "Cloning, characterization, and the complete 56.8-kilobase DNA

RT sequence of the human NOTCH4 gene.";

RL Genomics 51:45-58(1998). 

RN [3] 

RP SEQUENCE OF 1-503 FROM N.A., AND VARIANTS GLN-117 AND GLN-317. 



RA Miyagawa T., Tokunaga K., Hojho H.;

RT "Human notch4 gene variant.";

RL Submitted (FEB-1999) to the EMBL/GenBank/DDBJ databases.

RN [4]

RP IDENTIFICATION OF LIGANDS.

RX MEDLINE=99180765; PubMed=10079256;

RA Gray G.E., Mann R.S., Mitsiadis E., Henrique D., Carcangiu M.-L.,

RA Banks A., Leiman J., Ward D., Ish-Horowitz D., Artavanis-Tsakonas S.;

RT "Human ligands of the Notch receptor.";

RL Am. J. Pathol. 154:785-794(1999).

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs. May regulate branching morphogenesis in the 

CC developing vascular system (By similarity).

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N- 

CC terminal fragment N(EC) which are probably linked by disulfide 

CC bonds (By similarity).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 

CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=3; 

CC Comment=Experimental confirmation may be lacking for some 

CC isoforms; 

CC Name=1;

CC IsoId=Q99466-1; Sequence=Displayed;

CC Name=2;

CC IsoId=Q99466-2; Sequence=VSP_001406;

CC Name=3;

CC IsoId=Q99466-3; Sequence=VSP_001407;

CC -!- TISSUE SPECIFICITY: Highly expressed in the heart, moderately in

CC the lung and placenta and at low levels in the liver, skeletal 

CC muscle, kidney, pancreas, spleen, lymph node, thymus, bone marrow 

CC and fetal liver. No expression was seen in adult brain or 

CC peripheral blood leukocytes. 

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form 

CC which is proteolytically cleaved by a furin-like convertase in the 

CC trans-Golgi network before it reaches the plasma membrane to yield 

CC an active, ligand-accessible form. Cleavage results in a C- 

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following 

CC ligand binding, it is cleaved by TNF-alpha converting enzyme

CC (TACE) to yield a membrane-associated intermediate fragment called

CC notch extracellular truncation (NEXT). This fragment is then

CC cleaved by presenilin dependent gamma-secretase to release a

CC notch-derived peptide containing the intracellular domain (NICD)

CC from the membrane (By similarity).

CC -!- PTM: Phosphorylated (By similarity).

CC -!- POLYMORPHISM: The poly-Leu region of NOTCH4 (in the signal

CC peptide) is polymorphic and the number of Leu varies in the 

CC population (from 6 to 12). 

CC -!- SIMILARITY: Belongs to the NOTCH family.

CC -!- SIMILARITY: Contains 28 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 

CC -!- SIMILARITY: Contains 5 ANK repeats. 



CC -!- CAUTION: Ref.1 sequence differs from that shown due to frameshifts
CC in position 1438 to 1463. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; D63395; BAA09708.1; ALT_FRAME. 

DR EMBL; D86566; BAA13116.1; -. 

DR EMBL; U95299; AAC32288.1; -. 

DR EMBL; U89335; AAC63097.1; -. 

DR EMBL; AB023961; BAB20317.1; -. 

DR EMBL; AB024520; BAA88951.1; 

DR EMBL; AB024578; BAA88952.1; -. 

DR HSSP; P08709; 1BF9.

DR Genew; HGNC:7884; NOTCH4.

DR MIM; 164951;

DR InterPro; IPR002110; ANK.

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR002049; Laminin_EGF.

DR InterPro; IPR008297; Notch. 

DR InterPro; IPR000800; Notch_dom. 

DR Pfam; PF00023; ank; 6. 

DR Pfam; PF00008; EGF; 26. 

DR Pfam; PF00066; notch; 2. 

DR PIRSF; PIRSF002279; Notch; 1. 

DR PRINTS; PR00010; EGFBLOOD.

DR PRINTS; PR00011; EGFLAMININ. 

DR PRINTS; PR01452; NOTCH. 

DR SMART; SM00248; ANK; 5. 

DR SMART; SM00179; EGF_CA; 11.

DR SMART; SM00004; NL; 3.

DR PROSITE; PS50297; ANK_REP_REGION; 1.

DR PROSITE; PS50088; ANK_REPEAT; 5.

DR PROSITE; PS00010; ASX_HYDROXYL; 11.

DR PROSITE; PS00022; EGF_1; 28.

DR PROSITE; PS01186; EGF_2; 21.

DR PROSITE; PS50026; EGF_3; 28.

DR PROSITE; PS01187; EGF_CA; 9. 

KW Receptor; Transcription regulation; Activator; Differentiation; 

KW Developmental protein; Repeat; ANK repeat; EGF-like domain; 

KW Transmembrane; Glycoprotein; Signal; Phosphorylation; Polymorphism; 

KW Triplet repeat expansion; Alternative splicing. 

FT SIGNAL 1 23 POTENTIAL. 

FT CHAIN 24 2003 NEUROGENIC LOCUS NOTCH PROTEIN HOMOLOG 4. 

FT CHAIN 1432 2003 NOTCH EXTRACELLULAR TRUNCATION 

FT (BY SIMILARITY).

FT CHAIN 1467 2003 NOTCH INTRACELLULAR DOMAIN 

FT (BY SIMILARITY).

EXTRACELLULAR (POTENTIAL).
POTENTIAL.
CYTOPLASMIC (POTENTIAL).
EGF-LIKE 1.
EGF-LIKE 2.
EGF-LIKE 3.
EGF-LIKE 4.
EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 6.
EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 9, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 10.
EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 12, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 13, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 15, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 16.
EGF-LIKE 17.
EGF-LIKE 18.
EGF-LIKE 19.
EGF-LIKE 20.
EGF-LIKE 21.
EGF-LIKE 22.
EGF-LIKE 23.
EGF-LIKE 24.
EGF-LIKE 25.
EGF-LIKE 26.
EGF-LIKE 27.
EGF-LIKE 28.
EGF-LIKE 29.
POLY-ARG.
LIN/NOTCH 1.
LIN/NOTCH 2.
LIN/NOTCH 3.
ANK 1.
ANK 2.
ANK 3.
ANK 4.
ANK 5.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.
BY SIMILARITY.

Query Match 14.2%; Score 959.5; DB 1; Length 2003;

Best Local Similarity 26.5%; Pred. No. 7.4e-50;



Matches 298; Conservative 64; Mismatches 324; Indels 437; Gaps 



Qy 94 CCPGFYESGEMCVPHCADKC VHGRCIAPNT CQCEPGWGGTNCSSACDGDH 143 

I I i I : I I I II IN : I I I I I I i 

Db 105 CLPGF--TGERCQAKLEDPCPPSFCSKRGRCHIQASGRPQCSCMPGWTGEQCQLR 157

Qy 144 WGPHCTSRCQCKNGALCNPITG--ACHCAAGFRGWRCE DRCEQG TYGNDCHQ- 193

I : : I I I : I I I I II I I I : I I III

Db 158 --DFCSAN-PCVNGGVCLATYPQIQCHCPPGFEGHACERDVNECFQDPGPCPKGTSCHNT 214

Qy 194 RCQ CQNGATC DHVTGECRCPPGYTGAFCE 222 

II: I I I I t I : I I I 

Db 215 LGSFQCLCPVGQEGPRCELRAGPCPPRGCSNGGTCQLMPEKDSTFHLCLCPPGFIGPDCE 274 

Qy 223 DLCPPGKHG PQCEQRCP--CQNGGVCHH 248 

ill I : II : I I : I I I I : 

Db 275 VNPDNCVSHQCQNGGTCQDGLDTYTCLCPETWTGWDCSEDVDECETQGPPHCRNGGTCQN 334 

Qy 249 VTG--ECSCPSGWMGTVCGQP CPEGRFGKNCSQE- 280 

MM: II 

Db 335 SAGSFHCVCVSGWGGTSCEENLDDCIAATCAPGSTCIDRVGSFSCLCPPGRTGLLCHLED 394 

Qy 281 -C QCHNGGTC--DAATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQCVNG 329

| || I : I I 11111:11 I M : I Ml

Db 395 MCLSQPCHGDAQCSTNPLTGSTLCLCQPGYSGPTCHQDLDECLMAQQG PSPCEHG 449

Qy 330 GKCYHVSGA--CLCEAGFAGERCEAR LCPEGLYGIK 363

II : I : I I I I : II I I I I I I I II 

Db 450 GSCLNTPGSFNCLCPPGYTGSRCEADHNECLSQPCHPGSTCLDLLATFHCLCPPGLEGQL 509 

Qy 364 CD KRC PCHLENTHSCHPMSG--ECACKPGWSGLYCNE 398

I : I II I I M : I M I : I I M

Db 510 CEVETNECASAPC--LNHADCHDLLNGFQCICLPGFSGTRCEEDIDECRSSPCANGGQCQ 567

Qy 399 TCSPGFYGEACQ QIC SCQNGADCDSVTGK--CTCAPGFKGI DCSTP 442

II II I I I I I Ml : I II III I I 

Db 568 DQPGAFHCKCLPGFEGPRCQTEVDECLSDPCPVGASCLDLPGAFFCLCPSGFTGQLCEVP 627 

Qy 443 CPLGTYGI NCSSRCG-CKNDAVCSPVDGSCTC 473 

II I M M : || : III 
Db 628 LCAPNLCQPKQICKDQKDKANCLCPDGSPGCAPPEDNCTCHHGHCQR SSCVC 679 

Qy 474 KAGWHGVDC SIRCPSGTWGFGCN LTCQ CL 502 

M M I : M : I I I : I II 

Db 680 DVGWTGPECEAELGGCISAPCAHGGTCYPQPSGYNCTCPTGYTGPTCSEEMTACHSGPCL 739 

Qy 503 NGGACNTLDG--TCTCAPGWRGEKCE LPC QDGTYGLNCA 539 

I I I : I I i I I II Ml: II Ml: II 

Db 740 NGGSCNPSPGGYYCTCPPSHTGPQCQTSTDYCVSAPCFNGGTCVNRPGTFSCLCAMGFQG 799

Qy 540 ERCD CSHADGCH--PTTGHCRCLPGWSGVHCDS 570

II: M I I II I : : I M 

Db 800 PRCEGKLRPSCADSPCRNRATCQDSPQGPRCLCPTGYTGGSCQTLMDLCAQKPCPRNSHC 859 

Qy 571 VCAEGRWGPNCSLP CYCKNGASCS PDDG ICEC 602 

: M I I I I : I I II I I M M 

Db 860 LQTGPSFHCLCLQGWTGPLCNLPLSSCQKAALSQGIDVSSLCHNGGLC-VDSGPSYFCHC 918 



Qy 603 APGFRGTTCQR ICSPGFYGHRCSQTCPQCVHSSGP 637 

| M : | : | I I : I I : | | | : III 

Db 919 PPGFQGSLCQDHVNPCESRPCQNGATCMAQPSGYLCQCAPGYDGQNCSKELDAC--QSQP 976

Qy 638 CHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAG ICTCTNNGTCNPIDRS 687

I I : I I I I I I I I I i I I II I : : :

Db 977 CHN-HGTCTPKPG--GFHC--ACPPGFVGLRCEGDVDECLDQPCHPTGTAACHSLANAFY 1031

Qy 688 CQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGE CKCTPGWTGLYC 742 

I I I I I I I III I : I I I I I I I : I I 

Db 1032 CQCLPGHTGQWCEVEIDPCHSQP CFHGGTCEATAGSPLGFICHCPKGFEGPTC 1084 

Qy 743 TQRCP-LGFYGKDCALICQCQNGADC DHISGQCTCRTGFMGRHC-EQKCPSGTYG 795

: I I I I : I : I I : | | : | : | | I 
Db 1085 SHRAPSCGFH HCHHGGLCLPSPKPGFPPRCACLSGYGGPDCLTPPAPK 1132 

Qy 796 YGCRQICDCLNNSTCDHITG TCYCSPGWKGARCDQAG 832 

II 111:111 II I I I : I 

Db 1133 -GCGPPSPCLYNGSCSETTGLGGPGFRCSCPHSSPGPRCQKPG 1174 



RESULT 14 
FBP1_STRPU 

ID FBP1_STRPU STANDARD; PRT; 1064 AA.

AC P10079; 

DT 01-MAR-1989 (Rel. 10, Created) 

DT 01-FEB-1996 (Rel. 33, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Fibropellin I precursor (Epidermal growth factor-related protein 1) 

DE (UEGF-1).

GN EGF1.

OS Strongylocentrotus purpuratus (Purple sea urchin).

OC Eukaryota; Metazoa; Echinodermata; Eleutherozoa; Echinozoa;

OC Echinoidea; Euechinoidea; Echinacea; Echinoida; Strongylocentrotidae;

OC Strongylocentrotus.

OX NCBI_TaxID=7668;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=90112459; PubMed=2514273;

RA Delgadillo-Reynoso M.G., Rollo D.R., Hursh D.A., Raff R.A.;

RT "Structural analysis of the uEGF gene in the sea urchin 

RT strongylocentrotus purpuratus reveals more similarity to vertebrate 

RT than to invertebrate genes with EGF-like repeats."; 

RL J. Mol. Evol. 29:314-327(1989). 

RN [2] 

RP SEQUENCE OF 279-476 AND 781-1064 FROM N.A. 

RX MEDLINE=87319677; PubMed=3498216;

RA Hursh D.A., Andrews M.E., Raff R.A.;

RT "A sea urchin gene encodes a polypeptide homologous to epidermal 

RT growth factor."; 

RL Science 237:1487-1490(1987).

RN [3] 

RP AVIDIN-LIKE DOMAIN. 

RX MEDLINE=89196806; PubMed=2784773;

RA Hunt L.T., Barker W.C.; 

RT "Avidin-like domain in an epidermal growth factor homolog from a sea 



RT urchin."; 

RL FASEB J. 3:1760-1764(1989). 

RN [4] 

RP CHARACTERIZATION.

RX MEDLINE=91285254; PubMed=2060714;

RA Bisgrove B.W., Andrews M.E., Raff R.A.;

RT "Fibropellins, products of an EGF repeat-containing gene, form a 

RT unique extracellular matrix structure that surrounds the sea urchin 

RT embryo."; 

RL Dev. Biol. 146:89-99(1991). 

CC -!- FUNCTION: Form the apical lamina, a component of the extracellular

CC matrix. 

CC -!- SUBCELLULAR LOCATION: EXTRACELLULAR. IN VESICLES IN THE CYTOPLASM 

CC OF UNFERTILIZED EGGS, THEN TO THE BASE OF THE HYALIN LAYER

CC THROUGHOUT DEVELOPMENT AND FINALLY IN THE APICAL LAMINA IN LATE 

CC EMBRYOS AND EARLY LARVAE. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=2; 

CC Name=IA; 

CC IsoId=P10079-1; Sequence=Displayed;

CC Name=IB;

CC IsoId=P10079-2; Sequence=VSP_000451;

CC -!- DEVELOPMENTAL STAGE: Moderate levels in unfertilized eggs and 

CC during early cleavage, then rapidly increases in abundance between 

CC late morula and mesenchyme blastula stages to maximal levels 

CC maintained through subsequent stages. Expressed both maternally 

CC and zygotically. 

CC -!- SIMILARITY: Contains 21 EGF-like domains. 

CC -!- SIMILARITY: Contains 1 CUB domain. 

CC -!- SIMILARITY: THE C-TERMINAL DOMAIN OF THIS PROTEIN IS SIMILAR TO 
CC AVIDIN/STREPTAVIDIN.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L08692; AAA62164.1; 

DR EMBL; L08692; AAA62163.1; -. 

DR EMBL; X17530; CAA35571.1; -. 

DR EMBL; M17421; AAA30050.1; 

DR EMBL; X17533; CAA35573.1; -. 

DR PIR; A40136; A40136. 

DR HSSP; P01132; 1EGF. 

DR InterPro; IPR000152; Asx_hydroxyl_S.

DR InterPro; IPR005469; Avidin. 

DR InterPro; IPR005468; Avidin/str. 

DR InterPro; IPR000859; CUB. 

DR InterPro; IPR000742; EGF_2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR001438; EGF_II.

DR InterPro; IPR006209; EGF_like. 

DR Pfam; PF01382; Avidin; 1. 

DR Pfam; PF00431; CUB; 1. 



DR Pfam; PF00008; EGF; 21.
DR PRINTS; PR00709; AVIDIN.
DR PRINTS; PR00010; EGFBLOOD.
DR SMART; SM00042; CUB; 1.
DR SMART; SM00179; EGF_CA; 20.
DR PROSITE; PS00010; ASX_HYDROXYL; 19.
DR PROSITE; PS00577; AVIDIN; 1.
DR PROSITE; PS01180; CUB; 1.
DR PROSITE; PS00022; EGF_1; 19.
DR PROSITE; PS01186; EGF_2; 19.
DR PROSITE; PS50026; EGF_3; 21.
DR PROSITE; PS01187; EGF_CA; 18.
KW Biotin; Alternative splicing; EGF-like domain; Repeat; Signal;
KW Glycoprotein; Calcium-binding.

FT DISULFID 772 781
FT DISULFID 788 799
FT DISULFID 793 808
FT DISULFID 810 819
FT DISULFID 826 837
FT DISULFID 831 846
FT DISULFID 848 857
FT DISULFID 864 875
FT DISULFID 869 884
FT DISULFID 886 895
FT DISULFID 902 913
FT DISULFID 907 922
FT DISULFID 924 933
FT CARBOHYD 30 30
FT 136
FT VARSPLIC

N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
Missing (in isoform IB).
/FTId=VSP_000451.
L -> S (IN REF. 2).
MW; 2E569CA012ED6D09 CRC64;



Query Match 14.2%; Score 954.5; DB 1; Length 1064; 

Best Local Similarity 28.0%; Pred. No. 7.5e-50; 

Matches 290; Conservative 93; Mismatches 305; Indels 347; Gaps 69 
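
The two percentages in the summary lines above can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the listing itself does not state: that Query Match is the hit score divided by the perfect score (6744 in the run header), and that Best Local Similarity is the match count divided by the total number of aligned columns (matches + conservative + mismatches + indels). The function names are hypothetical and are not part of GenCore.

```python
PERFECT_SCORE = 6744  # "Perfect score" reported in the run header


def query_match(score, perfect=PERFECT_SCORE):
    """Hit score as a percentage of the perfect (self-match) score (assumed definition)."""
    return 100.0 * score / perfect


def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Result 14 (FBP1_STRPU): Score 954.5; Matches 290; Conservative 93;
# Mismatches 305; Indels 347
print(f"Query Match: {query_match(954.5):.1f}%")
print(f"Best Local Similarity: {best_local_similarity(290, 93, 305, 347):.1f}%")
```

Under these assumptions the values come out at 14.2% and 28.0%, agreeing with the figures printed for this result.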



Qy 30 DPNVCSHWESYSVTVQESYPHPFDQI YYTSCTDILNWFKCTRHRVSYRTAYRHGEKTMYR 89 

I I | : | : : I I I :: I : I I 

Db 181 DPNLCQNG AACTDLVNDYACT 201 

Qy 90 RKSQCCPGFYESGEMC VPHCA-DKCVHGRCIAPN TCQCEPGWGGTNCS SACDG 141 

I I I I : I I : I I I I : I | | || : | I : : 

Db 202 CPPGF--TGRNCEIDIDECASDPCQNGGACVDGVNGYVCNCVPGFDGDECENNIN- 254

Qy 142 DHWGPHCTSRCQCKNGALCNPITGA CHCAAGFRGWRCE--DRCEQGTYGNDCHQR 194

II | | | : | : I I I I I I I I I I M

Db 255 ECAS-SPCLNGGIC--VDGVNMFECTCLAGFTGVRCEVNIDECAS 296

Qy 195 CQCQNGATC-DHVTG-ECRCPPGYTGAFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGE 252

I I I I I : I i I i I : : I I I : : : I I I I I I I :

Db 297 APCQNGGICIDGINGYTCSCPLGFSGDNCEN NDDECSS-IPCLNGGTCVDLVNA 349

Qy 253 --CSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGTC-DAATG-QCHCSPGYTGERCQ 308

| | I : I I I I I I I I ! :

Db 350 YMCVCAPGWTGPTCADNIDECA-SAPCQNGGVCIDGVNGYMCDCQPGYTGTHCE 402

Qy 309 DECPVGTYGVLCAETCQCVNGGKCYH-VSG-ACLCEAGFAGERCEARLCPEGLYGIK 363 

| M 1111 I I : I I : I I I I I I 

Db 403 TDIDEC ARPPCQNGGDCVDGVNGYVCICAPGFDGLNCE 440

Qy 364 CDKRCPCHLENTHSC--HPMSGECACKPGWSGLYCNETCSPGFYGEACQ--QICS--C 415

| | | I I : I I I I I I : I I : I = I

Db 441 NNIDECASRPCQNGAVCVDGVNGFVC--TCSAGYTGVLCETDINECASMPC 489

Qy 416 QNGADC-DSVTGK-CTCAPGFKGIDCSTPCPLGTYGINCSSRCGCKNDAVCS-PVDG-SC 471

|| I I I I I I I I I I : I : I I hi hill: hi I

Db 490 LNGGVCTDLVNGYICTCAAGFEGTNCETDTD ECAS-FPCQNGATCTDQVNGYVC 542

Qy 472 TCKAGWHGVDC SIRCPSG TWGFGCNL TCQ 500

M hill I I : I hi I h

Db 543 TCVPGYTGVLCETDINECASFPCLNGGTCNDQVNGYVCVCAQDTSVSTCETDRDECASAP 602

Qy 501 CLNGGAC-NTLDG-TCTCAPGWRGEKCEL PCQDGTYGLNCAERCDCSHADGC 550

INN!::: Ill III I I h I I : I I I I : : : I :

Db 603 CLNGGACMDVVNGFVCTCLPGWEGTNCEINTDECASSPCMNG--GL-CVDQVN-SYV 655

Qy 551 HPTTGHCRCLPGWSGVHCDSVCAEGRWGPNCSLPCYCKNGASC--SPDDGICECAPGFRG 608

I I I I I : : h I I : I 1 I II I I tllh

Db 656 CFCLPGFTGIHCGTEIDECASSPCLNGGQCIDRVDSYECVCAAGYTA 702

Qy 609 TTCQ RICSPGFYGHRCSQTCPQCVHSSGPC 638

|| | : | | : I i : I : I I I

Db 703 VRCQINIDECASAPCQNGGVCVDGVNGYVCNCAPGYTGDNCETEIDEC--ASMPCLNGGA 760

Qy 639 --HHITG-LCDCLPGFTGALC NEVCPSGRFGKNCAGICTCTNNGTCNPIDRS 687

: | | | : |: I I : I : I : I h I I I I I

Db 761 CIEMVNGYTCQCVAGYTGVICETDIDECASAPCQNGGVCTDTINGYI 807

Qy 688 CQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDG ECKCTPGWTGLYCT 743

| | | | : | | : | I I I I I M h I I : : I M

Db 808 CACVPGFTGSNCETNIDECASDP CLNGGIC--VDGVNGFVCQCPPNYSGTYCE 858

Qy 744 QRCPLGFYGKDCALICQCQNGADCDHISGQ-- CTCRTGFMGRHCE--QKCPSGTYGYGC 798



HIM | :: I I I: :| I

Db 859 > r hm cat r\/\Tvr; a n YVCECVPGYAGQNCETDINECAS 904

Qy 799

|| I I I I I I I I = I I : : I : = : : : : I I

Db 905

Qy 849

Db 962

Qy 909

Db 996

: I : : I : I : II II

RESULT 15
NTC4_MOUSE 

ID NTC4_MOUSE STANDARD; PRT; 1964 AA. 

AC P31695; O35442; O88314; O88316; Q62389; Q62390; Q9R1W9; Q9R1X0;

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Neurogenic locus notch homolog protein 4 precursor (Notch 4) 

DE [Contains: Transforming protein Int-3].

GN NOTCH4 OR INT3 OR INT-3.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92194507; PubMed=1312643;

RA Robbins J., Blondel B.J., Gallahan D., Callahan R.;

RT "Mouse mammary tumor gene int-3: a member of the notch gene family 

RT transforms mammary epithelial cells."; 

RL J. Virol. 66:2594-2599(1992). 
RN [2] 

RP REVISIONS, SEQUENCE FROM N.A. 

RX MEDLINE=97294599; PubMed=9150355;

RA Gallahan D., Callahan R.;

RT "The mouse mammary tumor associated gene INT3 is a unique member of

RT the NOTCH gene family (NOTCH4).";

RL Oncogene 14:1883-1890(1997). 
RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Lung, and Testis; 

RX MEDLINE=96281668; PubMed=8681805;

RA Uyttendaele H., Marazzi G., Wu G., Yan Q., Sassoon D., Kitajewski J.;
RT "Notch4/int-3, a mammary proto-oncogene, is an endothelial
RT cell-specific mammalian Notch gene.";
RL Development 122:2251-2259(1996).
RN [4]

RP SEQUENCE FROM N.A.

RA Rowen L., Mahairas G., Qin S., Ahearn M.E., Dankers C., Lasky S.,
RA Loretz C., Schmidt S., Tipton S., Traicoff R., Zackrone K., Hood L.;
RT "Sequence of the mouse major histocompatibility locus class III 



RT region."; 

RL Submitted (OCT-1997) to the EMBL/GenBank/DDBJ databases.

RN [5]

RP SEQUENCE OF 1436-1600 FROM N.A.

RX MEDLINE=99252212; PubMed=10233982;

RA Lee J.-S., Haruna T., Ishimoto A., Honjo T., Yanagawa S.-I.;

RT "Intracisternal type A particle-mediated activation of the Notch4/int3 

RT gene in a mouse mammary tumor: generation of truncated Notch4/int3 

RT mRNAs by retroviral splicing events."; 

RL J. Virol. 73:5166-5171(1999). 

RN [6] 

RP FUNCTION. 

RX MEDLINE=21244657; PubMed=11344305;

RA Uyttendaele H., Ho J., Rossant J., Kitajewski J.;

RT "Vascular patterning defects associated with expression of activated

RT Notch4 in embryonic endothelium.";

RL Proc. Natl. Acad. Sci. U.S.A. 98:5643-5648(2001).

RN [7] 

RP SEQUENCE OF 1463-1964, POST-TRANSLATIONAL PROCESSING, AND MUTAGENESIS 

RP OF VAL-1463. 

RX MEDLINE=21523956; PubMed=11518718;

RA Saxena M.T., Schroeter E.H., Mumm J.S., Kopan R.;

RT "Murine notch homologs (N1-4) undergo presenilin-dependent

RT proteolysis.";

RL J. Biol. Chem. 276:40268-40273(2001).

RN [8] 

RP POST-TRANSLATIONAL PROCESSING. 

RX MEDLINE=21374376; PubMed=11459941;

RA Mizutani T., Taniguchi Y., Aoki T., Hashimoto N., Honjo T.;

RT "Conservation of the biochemical mechanisms of signal transduction

RT among mammalian Notch family members.";

RL Proc. Natl. Acad. Sci. U.S.A. 98:9026-9031(2001).

CC -!- FUNCTION: Functions as a receptor for membrane-bound ligands 

CC Jagged1, Jagged2 and Delta1 to regulate cell-fate determination.

CC Upon ligand activation through the released notch intracellular 

CC domain (NICD) it forms a transcriptional activator complex with 

CC RBP-J kappa and activates genes of the enhancer of split locus. 

CC Affects the implementation of differentiation, proliferation and 

CC apoptotic programs (By similarity). May regulate branching

CC morphogenesis in the developing vascular system. 

CC -!- SUBUNIT: Heterodimer of a C-terminal fragment N(TM) and a N- 

CC terminal fragment N(EC) which are probably linked by disulfide 

CC bonds.

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. Following 

CC proteolytical processing NICD is translocated to the nucleus. 

CC -!- TISSUE SPECIFICITY: Highly expressed in lung, moderately in heart 

CC kidney, and at lower levels in the ovary and skeletal muscle. A 

CC very low expression is seen in the brain, intestine, liver and 

CC testis. 

CC -!- DEVELOPMENTAL STAGE: Highly expressed in endothelial cells during

CC embryonic development from 9.0 dpc.

CC -!- PTM: Synthesized in the endoplasmic reticulum as an inactive form
CC which is proteolytically cleaved by a furin-like convertase in the

CC trans-Golgi network before it reaches the plasma membrane to yield

CC an active, ligand-accessible form. Cleavage results in a C-

CC terminal fragment N(TM) and a N-terminal fragment N(EC). Following 

CC ligand binding, it is cleaved by TNF-alpha converting enzyme 



CC (TACE) to yield a membrane-associated intermediate fragment called 
CC notch extracellular truncation (NEXT). This fragment is then 
CC cleaved by presenilin dependent gamma-secretase to release a 
CC notch-derived peptide containing the intracellular domain (NICD) 
CC from the membrane. 
CC -!- PTM: Phosphorylated. 
CC -!- DISEASE: Loss of the extracellular domain causes constitutive 
CC activation of the Notch protein, which leads to hyperproliferation 
CC of glandular epithelial tissues and development of mammary 
CC carcinomas. 
CC -!- SIMILARITY: Belongs to the NOTCH family. 
CC -!- SIMILARITY: Contains 29 EGF-like domains. 
CC -!- SIMILARITY: Contains 3 Lin/Notch repeats. 
CC -!- SIMILARITY: Contains 5 ANK repeats. 
CC -------------------------------------------------------------------------- 
CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 
CC -------------------------------------------------------------------------- 
DR EMBL; M80456; AAB38377.1; 
DR EMBL; U43691; AAC52630.1; 
DR EMBL; U43691; AAC52631.1; -. 
DR EMBL; AF030001; AAB82004.1; -. 
DR EMBL; AB016771; BAA32281.1; ALT_SEQ. 
DR EMBL; AB016772; BAA32283.1; ALT_INIT. 
DR EMBL; AB016773; BAA32284.1; ALT_INIT. 
DR EMBL; AB016774; BAA32285.1; -. 
DR PIR; A38072; TVMVT3. 
DR PIR; T09059; T09059. 
DR HSSP; P08709; 1BF9. 
DR MGD; MGI:107471; Notch4. 
DR InterPro; IPR002110; ANK. 
DR InterPro; IPR000152; Asx_hydroxyl_S. 
DR InterPro; IPR000742; EGF_2. 
DR InterPro; IPR001881; EGF_Ca. 
DR InterPro; IPR001438; EGF_II. 
DR InterPro; IPR006209; EGF_like. 
DR InterPro; IPR002049; Laminin_EGF. 
DR InterPro; IPR008297; Notch. 
DR InterPro; IPR000800; Notch_dom. 
DR Pfam; PF00023; ank; 6. 
DR Pfam; PF00008; EGF; 27. 
DR Pfam; PF00066; notch; 2. 
DR PIRSF; PIRSF002279; Notch; 1. 
DR PRINTS; PR00010; EGFBLOOD. 
DR PRINTS; PR00011; EGFLAMININ. 
DR PRINTS; PR01452; NOTCH. 
DR SMART; SM00248; ANK; 6. 
DR SMART; SM00179; EGF_CA; 11. 
DR SMART; SM00004; NL; 2. 
DR PROSITE; PS50297; ANK_REP_REGION; 1. 
DR PROSITE; PS50088; ANK_REPEAT; 5. 
DR PROSITE; PS00010; ASX_HYDROXYL; 11. 
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CALCIUM-BINDING (POTENTIAL) 



La t ion; Activator; Differentiation; 
t; ANK repeat; EGF-like domain; 

Signal; Phosphorylation; Proto-oncogene. 

POTENTIAL. 

NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 4 

TRANSFORMING PROTEIN INT-3. 

NOTCH EXTRACELLULAR TRUNCATION. 

NOTCH INTRACELLULAR DOMAIN. 

EXTRACELLULAR (POTENTIAL). 

POTENTIAL. 

CYTOPLASMIC (POTENTIAL). 
EGF-LIKE 1. 
EGF-LIKE 2. 
EGF-LIKE 3. 
EGF-LIKE 4. 
EGF-LIKE 5. 
EGF-LIKE 6. 
EGF-LIKE 7. 
EGF-LIKE 8. 
EGF-LIKE 9. 
EGF-LIKE 10. 
EGF-LIKE 11. 
EGF-LIKE 12. 
EGF-LIKE 13. 
EGF-LIKE 14. 
EGF-LIKE 15. 
EGF-LIKE 16. 
EGF-LIKE 17. 
EGF-LIKE 18. 

EGF-LIKE 19. 

EGF-LIKE 20. 

EGF-LIKE 21. 

EGF-LIKE 22. 

EGF-LIKE 23. 

EGF-LIKE 24. 

EGF-LIKE 25. 

EGF-LIKE 26. 

EGF-LIKE 27. 

EGF-LIKE 28. 

EGF-LIKE 29. 

LIN/NOTCH 1. 

LIN/NOTCH 2. 

LIN/NOTCH 3. 



CALCIUM-BINDING (POTENTIAL). 
CALCIUM-BINDING (POTENTIAL). 
CALCIUM-BINDING (POTENTIAL). 
CALCIUM-BINDING (POTENTIAL). 
CALCIUM-BINDING (POTENTIAL). 
CALCIUM-BINDING (POTENTIAL). 
CALCIUM-BINDING (POTENTIAL). 
CALCIUM-BINDING (POTENTIAL). 



ir3eSL LOCdl OXILLXXclX. J_ ^-l.su, ~- . 

Matches 321; Conservative 75; Mismatches 348; Indels 545; Gaps 70; 



Qy 95 CP-GFYESGEMCVPHCADKCV HGRCIAPNT CQCEPGWGGTNCS SACDGDH 143 

|| |! :|: I I : I I I = I INN ' I 

Db 102 CPSGF--TGDRCQTHLEELCPPSFCSNGGHCYVQASGRPQCSCEPGWTGEQCQLR-- 154 

Qy 144 WGPHCTSRCQCKNGALCNPITG--ACHCAAGFRGWRCE- DRCEQGTYGNDC 191 



I:: I II :| I I II I II ' ' ' ' { 

Db 155 --DFCSAN-PCANGGVCLATYPQIQCRCPPGFEGHTCERDINECFLEPGPCPQGT--SC 208 



Qy 192 HQ RCQ CQNGATCD HVTGE-CRCPPGYTGA 219 

| : | : I I I I I M 11111:11 

Db 209 HNTLGSYQCLCPVGQEGPQCKLRKGACPPGSCLNGGTCQLVPEGHSTFHLCLCPPGFTGL 268 



Qy 220 FCE DLCPPGKHG PQCEQRCP-- CQNGGV 245 

|| I I I I : II I I I = I I I 

Db 269 DCEMNPDDCVRHQCQNGATCLDGLDTYTCLCPKTWKGWDCSEDIDECEARGPPRCRNGGT 328 



Qy 246 CHHVTG--ECSCPSGWMGTVCGQP CPEGRFGKNCS 278 

I : I I I I II I I = I I I I I I 

Db 329 CQNTAGSFHCVCVSGWGGAGCEENLDDCAAATCAPGSTCIDRVGSFSCLCPPGRTGLLCH 388 



Qy 279 QE--C QCHNGGTC--DAATGQ--CHCSPGYTGERCQ DECPVGTYGVLCAETCQC 326 

|| || I : II I I I I I : I I Ml: I ' 
Db 389 LEDMCLSQPCHVNAQCSTNPLTGSTLCICQPGYSGSTCHQDLDECQMAQQG PSPC 443 



Qy 327 VNGGKCYHVSGA--CLCEAGFAGERCEAR LCPEGLY 360 

: M I : I : I M I : I I I I I 11 1 1 1 

Db 444 EHGGSCINTPGSFNCLCLPGYTGSRCEADHNECLSQPCHPGSTCLDLLATFHCLCPPGLE 503 



Qy 361 GIKCD KRC PCHLENTHSCHPMSG--ECACKPGWSGLYCNE 398 

| | : I II I Ml: = I I I I : : I I : 

Db 504 GRLCEVEVNECTSNPC--LNQAACHDLLNGFQCLCLPGFTGARCEKDMDECSSTPCANGG 561 



Qy 399 TCSPGFYGEACQQICS CQNGADCDSVTGK--CTCAPGFKGIDC 439 

| | | | | | : : III: I I I I I I I 

Db 562 RCRDQPGAFYCECLPGFEGPHCEKEVDECLSDPCPVGASCLDLPGAFFCLCRPGFTGQLC 621 

Qy 440 STP CPLGTYGI NCSSRCG 457 

| II I: I II I 

Db 622 EVPLCTPNMCQPGQQCQGQEHRAPCLCPDGSPGCVPAEDNCPCHHGHCQRSLCVCDEGWT 681 

Qy 458 CKNDAVCSPVDG--SCTCKAGWHGVDCS IRCPSGTW 491 

I : II : I I I I I : I : I I I I I 

Db 682 GPECETELGGCISTPCAHGGTCHPQPSGYNCTCPAGYMGLTCSEEVTACHSGPCLNGGSC 741 



Qy 492 GFGCN LTCQCLNGGACNTLDGT--CTCAPGWRGEKCE- 526 

| : | : : Mill I I I I I I I I I I 

Db 742 SIRPEGYSCTCLPSHTGRHCQTAVDHCVSASCLNGGTCVNKPGTFFCLCATGFQGLHCEE 801 



Qy 527 LPCQDGTYGLNC AERCDCSHADGCHPT 553 

Ml I I I I I I : oci 

Db 802 KTNPSCADSPCRNKATCQDTPRGARCLCSPGYTGSSCQTLIDLCARKPCPHTARCLQSGP 861 

Qy 554 TGHCRCLPGWSGVHCD SVCAEGRWGP 579 

: I I I I I : I I I III 
Db 862 SFQCLCLQGWTGALCDFPLSCQMAAMSQGIEISGLCQNGGLCIDTGSSYFCRCPPGFQGK 921 

Qy 580 NCS--LPCY---CKNGASCSPDDG--ICECAPGFRGTTCQRI 614 

I II I : I = : I I : I : I I I I : I I : : 

Db 922 LCQDNMNPCEPNPCHHGSTCVPQPSGYVCQCAPGYEGQNCSKVLEACQSQPCHNHGTCTS 981 



Qy 615 CSPGFYGHRCSQTCPQCV HSSG--PCHHITG--LCDCLPGFTGALCN- 657 

| I I I I I I : I : I I I II: I I I I I I I I 



Db 982 RPGGFHCACPPGFVGLRCEGDVDECLDRPCHPSGTAACHSIANAFYCQCLPGHTGQRCEV 1041 

Qy 658 E V CPSGRFGKNCA GICTCTNNGTCNPI 684 

| : Mill: I I I I I I I 

Db 1042 EMDLCQSQPCSNGGSCEITTGPPPGFTCHCPKGFEGPTCSHKALSCGIHHCHNGGLCLPS 1101 

Qy 685 DRS CQCYPGWIGSDCSQPCPPAHWGP--NCIHTCNCHNGAFCSAYDGECKCTPGW 737 

III: = 1 =11111 

Db 1102 PKPGSPPLCACLSGFGGPDCLTPPAPPGCGPPSPCLH NGTCTETPGL 1148 

Qy 738 --TGLYCTQRCPLGFYGKDCALICQCQNGADCDHI SGQCTCRTG 779 

I II II I II : I : I I I I 

Db 1149 GNPGFQCT--CPPDSPGPR CQRPGASGCEGRGGDGTCDAGCSGPGGDWDGGDCSLG 1202 



Qy 780 FMGRHCEQKCPSGTY GYGCR QIC-DCLNNS 808 

I I : I I Ml I I I : I 

Db 1203 VPDPWKGCPPHSQCWLLFRDGRCHPQCDSEECLFDGYDCEIPLTCIPAYDQYCRDHFHNG 1262 



Qy 809 TCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSYQIGAIAGI 862 

|: I II I I I = I MM MM Ml = 

Db 1263 HCEKGCNNAEC--GWDGGDCRPEGE DSEGRPSLALLWLRPPALDQQLLALARV 1314 

Qy 863 I ILVLWLFLLALFI I YRHKQKGKES SMP 891 

: I I I M: I M: I 

Db 1315 LSLTLRV GLWV--RKDSEGRNMVFP 1337 
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